THE EFFECTS OF A “CAUSAL” TEACHER-TRAINING PROGRAM AND CERTAIN
CURRICULAR CHANGES ON GRADE SCHOOL CHILDREN


RALPH H. OJEMANN, EUGENE E. LEVITT,
WILLIAM H. LYLE, Jr., MAXINE F. WHITESIDE
Child Welfare Research Station 


THE PURPOSE of this paper is to report the 
results of a learning program designed to help 
the child develop a ‘‘causal’’ orientation toward 
his social environment. The learning program 
used in this study involved both the training of 
teachers and the use of certain special curric- 
ular content. 

The meaning of the term ‘‘causal’’ as used 
here has been detailed in an earlier publication 
(4). Briefly, it recognizes that human behavior 
is produced by many factors and that one can
distinguish between an approach to a given behavior
incident which recognizes and takes into account 
the variety of factors that may have produced it 
as compared with an approach that considers 
mainly the overt form of the behavior. 

The use of the term ‘‘causally oriented cur- 
ricular content’’ arises from the discovery that 
present curricular content relating to human be- 
havior as found, for example, in current social 
studies readers and texts is largely non-causal- 
ly or surface oriented (7, 8). 

The importance of specifying the learning pro- 
gram as involving the training of teachers rests 
on the following: Previous data have suggested
that teacher behavior toward children is essentially
non-causally oriented. Since our culture
is for the most part likewise oriented and since 
teachers have come up through that culture, this 
situation is not unexpected. But the tendency to- 
ward a non-causal orientation in the daily behav- 
ior of the teacher becomes important when we 
consider the problem of developing a causal ori- 
entation in the child. This may be explained as 
follows: 

When we teach arithmetic we can conceive of 
a situation in which the teacher would teach the 
child to perform the various number operations 
accurately while at the same time he (the teacher)
would make a number of ‘‘mistakes’’ on his
income tax report. The child need never see 
these ‘‘mistakes’’ and thus they would not direct- 
ly influence his learning. 

But in teaching an approach to human behavior
the situation is different. The teacher must
of necessity interact with the pupil. Through the 
approaches he makes to the pupil he provides a 
demonstration from which the pupil learns. If he 
approaches the pupil in a non-causal way, the pu- 
pil is experiencing a demonstration of a non- 
causal approach. 

Thus, in the area of human behavior the teach- 
er teaches in two ways: He teaches through the 
content studied and through the daily demonstrations
he provides. In a previous study (10) evidence
was obtained indicating that it is difficult
to develop a causal orientation if the regular 
classroom teacher and content remain essential- 
ly non-causally oriented and causal content is in- 
troduced for, say, one period a day by a trained 
but ‘‘imported’’ teacher. 

Testing the effects of a learning program using
trained teachers and causally oriented curricular
content involved: (a) a training program for
teachers, (b) a plan for changing curricular con- 
tent, (c) an appropriate experimental design, 
and (d) the gathering of data and analyses of re- 
sults. Each item will be briefly described in 
turn. 

The subjects of this study were four teachers 
and their pupils, each classroom matched with 
two control groups. One of the teachers was from
the fourth grade, one from the fifth, and two from
the sixth grade. All were from the school sys- 
tem of a midwest industrial town of about 75,000 
population. Since this investigation is part of a 
long range program, it was desired to develop a 
group of trained teachers who gave promise of 
remaining in the system for several years. Accordingly,
teachers were selected on this basis by
the school administration in consultation with the 
investigators. Data relative to the experimental 
subjects will be presented in a description of the
experimental design. 


Teacher Training Program


Our plans involved providing teachers with 
one month of intensive work during the summer 
and following through with group conferences 
every three weeks during the school year. These 
conferences were intended to give the teachers 
opportunity to discuss any questions or problems 
which might arise during the year as closely as 
possible to the time they might arise. 

The month's program of intensive work was 
set up under circumstances similar to the usual 
academic situation. Limited credit on a minor
problems basis was allowed for those teachers
who indicated that they wished to receive academic
credit. The program was organized in
terms of six units, all but one to be completed
during the four-week period. The description of
the units, the time devoted to each, and the reason
for their inclusion in the program are presented
below:


Unit 1. Developmental Problems of the Normal 
Child—three hours per week. The primary 
purpose of this unit was to draw the teachers’ 
attention to the fact that ‘‘having problems’’ 
does not necessarily make a child a problem 
child. Emphasis was upon the kinds of devel- 
opmental tasks children face at various ages, 
the kinds of basic learnings which are neces- 
sary for proper handling of these tasks, and 
the problems which are created when tasks ap- 
propriate to a particular age level are not 
learned before the following level. Selected
portions were assigned from F. Redl and W. W.
Wattenberg, Mental Hygiene in Teaching; R. J.
Havighurst, Human Development and Education;
Association for Supervision and Curriculum
Development Yearbook, 1950, Fostering
Mental Health in Our Schools; and Gladys Jenkins
et al., These Are Your Children. It was
our intention for teachers to understand from
these materials that children are continually
facing problems and that problems are a necessary
result of the child's expanding social
environment. Instructors: S. L. Zelen and
C. D. Smock.


Unit 2. Personal Problems of Everyday Life—
five hours per week. This unit was set up,
but not labeled, as 20 one-hour sessions of 
group psychotherapy. It was presented to the 
teachers as an opportunity to ‘‘extend the in- 
dividual’s understanding of the problems 
people ordinarily meet in their daily living;
to acquaint the individual with general psycho- 
logical principles which have maximum rele- 
vance to these problems both with regard to 
handling personal problems which exist cur- 
rently and the off-setting of present behavior 
trends which could conceivably lead to future
problems; and to assist in the development of 
personal techniques for meeting the frustrations
which most of us normally encounter.’’
The preparation of an extensive personal autobiography
was required, following essentially
the lines presented in Stogdill’s mental hygiene
workbook entitled Objective Personality
Study. Extensive comments were made about 
the material contained in the units of the work- 
book which were intended to stimulate think- 
ing about their personal experiences. The 
teachers were encouraged to explore the ex- 
tent to which their own personal biases and 
predilections might structure the classroom 
situation in the hope that this might minimize 
the extent of influence of that bias. No indi- 
vidual sessions were held with members of the 
group except when they presented themselves 
to ask for individual discussion, at which time 
they were encouraged to raise their questions 
for discussion in the group. That is, an ex- 
plicit attempt was made to focus attention on 
the group situation and to bring discussion 
material to the group rather than to take ma- 
terial away for individual sessions. Members 
of the group were assigned collateral reading 
from Philip Eisenberg, Why We Act As We Do, 
and from Hugh Cabot and Joseph A. Kahl, Hu- 
man Relations, Volume I, Concepts. These 
readings plus the autobiographical material 
provided the vehicle for group discussion. In- 
structor: W. H. Lyle. 


Unit 3. Action Research in the Classroom—two
hours per week. An essential part of this 
unit was an attempt to discourage the teacher 
from having too much confidence in her obser- 
vations and her ability to predict from them. 
Data were presented on the problems involved 
in the determination of the reliability of obser- 
vational techniques and the predictive effic- 
iency of these observations. Some methods 
for placing her own observations in a research
framework were presented and the teachers 
were encouraged to make their observations 
in a somewhat more systematic manner. Se- 
lected papers were used as source material 
and no outside reading was assigned. Instruc- 
tor: E. E. Levitt. 


Unit 4. The Causal Approach to an Understand- 
ing of Human Behavior—two hours per week. 
The primary purpose of this unit was to ac- 
quaint the teachers with the background of the 
project, its origin, and its present status. An
additional function of this unit was to acquaint
them with the special materials which had 
been developed by the project. Instructor: R. 
H. Ojemann. 


Unit 5. Meeting Classroom Problems—three 
hours per week. This unit was under the di- 
rection of an experienced classroom teacher 
who had been working with the project for the
past two years and had had direct experience 
in classroom situations. This was a technique- 
centered unit. That is, our attempt was to 
help the teacher to utilize known techniques 
in the handling of classroom problems and to 
develop special techniques which would allow 
her to meet the daily problems arising in the 
classroom. ‘‘Typical’’ classroom situations
were presented to the teachers to give them 
some experience in understanding what would 
be surface ways of handling these situations 
as opposed to possible causal methods. The 
probable effects of the methods were compared.
Our concern was with individual
needs, but it was our feeling that most would 
be accomplished if group and individual needs 
were met jointly. Many previous attempts to 
encourage the teacher to take individual needs 
of children into consideration have failed to 
consider that this can only be accomplished, 
or at least accomplished most effectively, 
within the framework of good group control. 
In this manner the constructive forces of the 
group are at the disposal of the teacher in 
meeting individual problems. All of the ma- 
terials developed and used by the project pre- 
viously were discussed with the teachers. In 
a sense, this unit might be considered as a 
practicum companion for Unit 1, since assis- 
tance in the handling of developmental tasks 
formed an important part of this unit. Instruc- 
tor: Mrs. Maxine Whiteside. 


Unit 6. Practicum in the Preparation of Special 
Materials—two hours per week. This unit 
represents our attempt to insure two-way com- 
munication. The project personnel felt the 
need of assistance in the adaptation of mater- 
ials to be used. It was our belief that those 
individuals closer to the teaching situation 
might adapt materials to the classroom situa- 
tion more effectively both from the point of 
view of interest and appropriateness. The 
participating teachers were encouraged to 
write materials to replace those we had devel- 
oped, to extend such materials, and to develop 
new materials utilizing the strengths in their 
own professional background. All materials 
were discussed with the project member act- 
ing as advisor and the joint suggestions incor- 
porated. For the most part, this proved to 
be a continuing project on which the teachers
worked during the entire year. Instructors: 
Staff. 


Conferences with Experimental Teachers
During School Year 


Twelve meetings were scheduled during the 
school year or approximately one meeting every 
three weeks. The general purpose of these meetings
was to provide an opportunity for the teachers
to ask questions concerning the classroom
work they were doing. It was recognized that 
the actual practicing of the causal approach in 
the classroom would give rise to more specific 
questions which could not be fully anticipated 
during the summer training program. In addition,
the meetings furnished the staff with an opportunity
to discuss additional topics with the
teachers. 

One or more members of the staff led discus- 
sions on various topics which can be grouped un- 
der seven general headings. 


Materials—At each meeting the teachers 
were given an opportunity to ask any questions 
about the materials they were using. At six of 
the meetings, questions were presented and discussed.
Other meetings were used for extending
teachers’ background in child behavior and
discussions relative to practicing the causal ap- 
proach in the classroom. 

At one meeting toward the close of the program
the teachers were asked to suggest, on the
basis of their experience, the teaching sequence 
for using the materials. A discussion of the 
merits of teaching one type of material before 
another and the like, resulted in agreement as 
to the most useful sequence according to their 
classroom organization. 

—One of the main topics of the first 
meeting was a discussion of specific classroom 
situations which the teachers had faced, comments
on the surface and causal methods to
handle such situations plus a description of the 
way the teachers had handled the situations. 
Part of nine other meetings was spent discuss- 
ing this topic. In this way, the teachers were 
given an opportunity to check their own behavior 
as surface or causal as well as the behavior of 
the pupils. 

For example, one teacher had been observing
a girl who seemed to play with no one, who
stayed by herself but had made it known that 
she wanted to associate with others. Meetings 
with the parent, conversations with the pupil 
were described after which teachers asked ques- 
tions to obtain additional information, such as 
the teacher's hypothesis concerning the causes 
of the described behavior. The group then made 
and evaluated recommendations for possible 
methods of dealing with this situation. 

Records—At the seventh and twelfth meetings
time was devoted to discussing what information
the teachers would like to have about
their pupils in order to better practice the caus- 
al approach in the classroom. 

Additional background—
The teachers were given an opportunity to ques- 
tion members of the staff relative to the findings 
of investigations of a variety of behavior patterns. 
The teachers’ questions arose primarily from 
observations of pupils in their rooms which 
prompted them to inquire about studies which might 
further their understanding of the pupils. Though 
some background had been provided during the 
summer program, a re-presentation was advan- 
tageous because of the teachers’ actual observa- 
tion of the behavior being discussed. 

For example, one question was ‘‘Would
you discuss the ‘shy child’ in general and then
consider a specific case which I will describe?’’

—At the second meeting the teachers
were asked to assist in the preparation of a
‘‘Tentative List of Outcomes Which Might Be Expected
as a Result of Teaching the Causal Approach.’’
After a tentative list had been prepared
they were asked to refer to it often during
the school year and then toward the end of the 
second semester, select the outcomes which 
they felt might be a result of teaching the causal 
approach at their particular grade level. The 
purpose of this exercise was to utilize the teach- 
ers’ experiences in making a tentative estima- 
tion as to what aspects of the causal approach 
may be developed at the respective grade levels. 

Evaluation—Toward the middle of the year
the teachers were asked to evaluate the training 
program of the previous summer by answering 
the following questions: 

1. What do you feel were the most valuable 
parts of the training program last summer? 
Please list at least two or three, with comments.

2. If a new group of teachers were to be
trained, what changes in the training program 
would you suggest? 

Results—The last meeting was primarily con- 
cerned with the presentation of the statistical an- 
alysis of the results of the program. 


Development of Curricular Content 


As indicated above, previous studies had dem- 
onstrated that content dealing with human behav- 
ior as currently found in elementary readers, so- 
cial studies and health texts is essentially surface 
or non-causal in nature. It was, therefore, nec- 
essary to develop more causally oriented content. 
To accomplish this a variety of materials were 
prepared. Some of these materials were avail- 
able from previous studies, some were prepared 
during the course of this investigation. 

In describing the preparation of causal content
a statement of the concepts and appreciations
which constitute the goals of the learning
program may facilitate discussion. We wish to
help the child to understand and appreciate more
about how his social environment operates. He
is taught that there are many ways in which a
given behavior pattern may develop, that causes
are complex, that people are faced with many
different situations which they are trying to work
out, that they use a variety of methods for this,
that additional methods may be available and that
all the methods may be considered in terms of
the effects they have.

In contrast to such concepts as these, chil- 
dren under present usual conditions are taught 
essentially what people do and primarily a judg- 
mental approach to the behavior without first 
seeking an understanding of how it came about. 

The situation appears somewhat comparable 
to that which prevailed in man’s reaction to his 
physical environment. At one time man took a 
more or less arbitrary approach to his physical 
environment. It is only relatively recently when 
we consider the span of human history that he 
learned a more dynamic approach. 

A list of some of the elementary concepts rep- 
resented in the causal orientation has also been 
reported elsewhere (4). 

The nature of the curricular content is further 
revealed by the specific materials developed. It 
is possible here only to list the various types. 
Readers who are interested in examining the ma- 
terials at first hand may obtain copies from the 
investigators. 

The types of materials are as follows: 


1. ... method—the ‘‘Teachers Manual for Behavior
Materials in the Primary Grades’’ is a collection
of twenty-seven stories grouped in sections
for use at different grade levels. Each
story deals with a particular behavior pattern. 
Preceding each story the manual supplies some 
background for the teachers. These materials 
have been described in earlier publications (7). 

The story is introduced and read by or to the 
pupils and is followed by a discussion designed 
to guide the pupils into thinking of the ‘‘reasons 
for the behavior’’ which were described in 
the story. The teacher keeps two general ques- 
tions in mind during discussions: 

1) Did the children understand the differences
between thinking of causes and not thinking
of causes?

2) Did the children gain ideas of ways to meet
ordinary problems so as to help each participant
grow?

Stories for use in the intermediate grades
were also written with a broader scope than those
for use in the primary grades.


2. Expository presentation of causal approach
as it applies both to development of behavior and
the consideration of the effects of behavior—two
pamphlets bearing the titles ‘‘Two Ways to Look
at How People Act’’ and ‘‘When We have to Decide’’
provide in expository form the differences
between the surface and causal approaches.

3. A series of workbooks which served as in- 
troductory units to social studies and health: 

Book I: How considering causes affects 
our reaction to behavior 

Books II and III: How people work out feelings 
of self-respect and ‘‘counting-for- 
something’’ 

Books IV and V: How physical differences, ex- 
periences and opportunities may 
affect different people 

Book VI: How past experiences affect methods 
people use 

The booklets provided a variety of exercises 
to be written out, unfinished situations for which 
endings were to be written or role-played and 
the like. 

4. Revised units in history and geography— 
sections of history and geography were revised 
to incorporate the elementary principles of hu- 
man behavior. For example, the unit on ‘‘The 
South’’ was revised to include discussions of how 
geographical and cultural conditions may influence
the situations people face and the methods
they employ to work them out. 

5. Units on the use of the room council—the 
material prepared by Stiles (9) for helping pupils 
apply the causal approach in room council discussions
has been described in previous publications.


In the preparation of these materials, consideration
was given to pupil interest, pupil experiences
and vocabulary burden. The Buckingham-Dolch
‘‘Combined Word List’’ and Green’s
‘‘Iowa Spelling Scale’’ aided in checking vocabulary
in material to be read by pupils. Listening
vocabulary was scaled higher than reading vocab- 
ulary in recognition of the differences between 
the two. 

Relating new situations or experiences to fam- 
iliar ones is a technique often used by teachers. 
This practice was taken into consideration in 
the writing of materials with one precaution. As 
is explained in the teachers’ manual of primary 
materials: ‘‘Since every child is engaged in work- 
ing out his own problems, it was felt that if the 
material dealt only with school and community 
situations of children like themselves, they may 
become so engrossed in their immediate prob- 
lems that they miss the larger more objective ap- 
preciation. Accordingly, situations involving 
children older and younger than themselves, and 
children from quite different environments as 
well as some situations involving children like 
themselves are included.’’

Since the incorporation of the causal approach
in teaching materials is relatively new, readers 
who are interested in detail are encouraged to 
examine the original materials. Particular ques- 
tions vary with the background of the reader and 
it is not possible to anticipate all of them. As a
guiding principle it may be helpful to keep in 
mind that the purpose of the learning program is 
to help the child gain more appreciation of how his
social environment operates just as physical sci- 
ence attempts to build an appreciation of how the 
physical environment operates. 


Experimental Design and Analysis of Results 


The evaluation of the teacher training pro- 
gram was actually concerned with pupil devel- 
opment rather than with teacher development per 
se. There were two reasons for this approach: 
a) the primary motive for the training of the 
teachers was to affect the pupils in certain ways, 
b) the number of teachers was obviously too 
small to permit any reliable measurement of 
teacher characteristics directly. The evaluation 
procedure is described in detail in the following
sections. 

Control teachers—Two control teachers were
selected for each of the four experimental group 
teachers. The control teachers were matched 
with the respective experimental teacher, inso- 
far as it was feasible, on a number of dimensions 
which might affect experimental results. These 
variables were age, sex, number of years of 
teaching experience, and educational level. The 
data are shown in Table I. The control teachers 
were selected from the same school system. The
twelve teachers represented ten different elem- 
entary schools. 

It would have been desirable to have been able
to control other potentially pertinent factors like 
teaching ability. However, an analysis of the 
available literature indicates that such expres- 
sions as teaching ability are rather nebulous and 
not easily defined or measured. It seemed pref- 
erable to deal with concrete measures and to as- 
sume that meaningful uncontrolled variables 
were randomly distributed among the groups of
teachers. 

In addition to their training, the experimental 
teachers had been provided with various materials
for use in teaching the ‘‘causal approach’’ in
the classroom. The purpose of the double con- 
trol group was to attempt to determine the effec- 
tiveness of these materials alone. Toward this 
end, the teachers in Control₁ were invited to secure
and make use of such of the materials as
they wished. The purposes and modes of use of 
the materials were outlined briefly. Their use 
was not, however, required of the Control₁
teachers and no attempt was made to insure 
that they were used. A check on the kinds of
material actually used and the number of hours
devoted to them was made at the conclusion of
the investigation (see page .10).

TABLE I

COMPARATIVE BACKGROUND DATA OF EXPERIMENTAL AND
CONTROL TEACHERS

                            Years      Educational
Teacher          Age  Sex   Teaching   Level

Fourth Grade
Experimental      26   F       5       B.A.
Control₁          27   F       5       B.A.
Control₂          26   F       4       B.A.

Fifth Grade
Experimental      52   F      36       B.A.
Control₁          50   F      30       B.A.
Control₂          52   F      32       B.A.

Sixth Grade (I)
Experimental      44   F      23       B.A.
Control₁          40   F      17       B.A.
Control₂          50   F      28       B.A.

Sixth Grade (II)
Experimental      26   F       5       B.A.
Control₁          28   F       5       B.A.
Control₂          26   F       4       B.A.

TABLE II

MEAN IQ SCORES BY CLASS

                Fourth    Fifth     Sixth        Sixth
                Grade     Grade     Grade (I)    Grade (II)   Total

Experimental    110.89    108.47    111.74       105.68       109.20
Control₁        109.76    110.08    101.64       101.20       105.72
Control₂        106.31    109.94    108.63       107.88       108.19
Total           109.53    109.53    106.78       104.40       107.48

All eight of the control teachers expressed a 
desire to be in Control₁, i.e., were apparently
interested in the materials. Teachers were assigned
at random to the two control groups. The
untreated control group, i.e., the one in which
the teachers had no contact whatsoever with the
experimenters, is designated as Control₂.
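
The random assignment described above can be sketched as follows. This is only an illustration under stated assumptions: it supposes that the two control teachers matched to each experimental classroom were split at random, one to the group offered the materials and one to the untreated group, and the teacher identifiers and function name are hypothetical. The text does not say what randomizing device was actually used.

```python
import random

# Hypothetical identifiers for the two control teachers matched to each
# experimental classroom; the study does not name them.
control_pairs = {
    "Fourth": ("teacher_A", "teacher_B"),
    "Fifth": ("teacher_C", "teacher_D"),
    "Sixth (I)": ("teacher_E", "teacher_F"),
    "Sixth (II)": ("teacher_G", "teacher_H"),
}


def assign_controls(pairs, rng):
    """Send one teacher of each matched pair to each control condition."""
    assignment = {}
    for block, pair in pairs.items():
        first, second = rng.sample(pair, 2)  # the two teachers in random order
        assignment[block] = {"materials_offered": first, "untreated": second}
    return assignment


# A fixed seed is used only so that the sketch is repeatable.
assignment = assign_controls(control_pairs, random.Random(0))
for block, groups in assignment.items():
    print(block, groups)
```

Assigning within each matched pair keeps the two control groups balanced on the matching variables of Table I, which is the reason for sketching it this way.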

The pupils—The matching of teachers need
not, of course, have any effect on the disposition
of pupils within the various groups of classes. It 
was necessary to be reasonably certain that the 
pupils in one or another of the groups were not 
superior in any relevant way. Age and sex were 
controlled automatically by the methods of assignment
of pupils to classes in the school system.
Some discrepancies in the sex ratio might occur, 
but earlier work with the tests to be used has
revealed no systematic sex differences in
performance. 

Intelligence is likely to be an important factor, 
as it usually is in studies of this type. IQ scores 
on the Otis Self-Administering Test, Intermedi- 
ate form, were secured from school records for 
all pupils who participated in the testing program. 
The mean IQ scores by class are shown in Table 
II. The results of a treatment-by-grades analysis
of the variance of IQ scores are shown in
Table III.

The analysis of variance of IQ scores yields
entirely negative results. There are hence no
significant differences in intelligence either
among the three treatment groups, or among 
grades, or among individual classes. We shall 
not, therefore, be able to attribute differences 
in performance to intelligence. 
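
The F ratios behind this conclusion can be rechecked from the sums of squares and degrees of freedom reported in Table III. The following is a minimal sketch, not the authors' computation; the variable names are illustrative and only the figures themselves are taken from the table.

```python
# Recompute mean squares and F ratios from the sums of squares and degrees
# of freedom reported in Table III (IQ scores). Names are illustrative.
table_iii = {
    "Treatments": (2, 565.95),
    "Grades": (3, 1028.78),
    "T x G": (6, 1273.21),
}
within_df, within_ss = 228, 36621.96

ms_within = within_ss / within_df  # error (within-cells) mean square

f_ratios = {}
for source, (df, ss) in table_iii.items():
    ms = ss / df
    f_ratios[source] = ms / ms_within
    print(f"{source:<12} df={df:<3} MS={ms:8.3f} F={f_ratios[source]:.3f}")

# The component sums of squares should add up to the reported total.
total_ss = within_ss + sum(ss for _, ss in table_iii.values())
print(f"Total SS = {total_ss:.2f}")
```

Each F is the mean square for that source divided by the within-cells mean square, so the check needs nothing beyond the table itself.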

We have been able to control age, sex, and intelligence
among the pupils in the various classes.
It is entirely possible that there are other, un- 
known variables which are pertinent to the exper- 
imental design, a not uncommon occurrence. A- 
gain we shall assume that the pupils are random- 
ly distributed among the groups with respect to 
such variables. 

The tests—Two instruments were used in the
evaluation proper. The first of these, the Prob- 
lem Situations Test (PST), has been the subject 
of considerable investigation. Its development 
is described elsewhere (5). The PST is a 22-item
multiple-choice test in which the subject is faced 
with a number of instances of misbehaviors or 
deficiencies of children and is required to deal 
with them either from the point of view of an authority
figure or from his own point of view.
There are six possible responses for each situa- 
tion, three punitive and three non-punitive. The 
punitive responses prescribe verbal or physical 
punishment, deprivation, or coercion. The responses
were obtained from an open-end form of
the test administered earlier to a group of fifth 
grade children. 

The PST is considered to be a measure of 
punitiveness in the child, that is, his willingness 
to be immediately punitive in a hypothetical situ- 
ation where no retaliation is anticipated. The 
score for punitiveness is the number of punitive 
responses to the 22 situations. The PST has 
been shown to be related to authoritarianism and 
parental disciplinary methods (6) and to extra- 
punitiveness and intrapunitiveness as measured 
by the Rosenzweig Picture-Frustration Study (3). 
The reliability of the PST has been estimated as 
.77 using the Kuder-Richardson formula 20, 
based on data obtained from the earlier studies. 
In a sample of fifth grade pupils the correlation 
between the PST and IQ was found to be -.29 (6) 
which indicates that only a small fraction of the 
variance of the test is due to intelligence. 
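
As a quick check on the phrase ‘‘only a small fraction,’’ the share of PST variance associated with IQ can be taken as the square of the reported correlation. The sketch below assumes that simple linear reading; the function name is illustrative.

```python
def shared_variance(r):
    """Proportion of variance two measures share under a linear relation."""
    return r ** 2


r_pst_iq = -0.29  # PST-IQ correlation reported for fifth grade pupils (6)
pst_share = shared_variance(r_pst_iq)
print(f"Shared variance: {pst_share:.4f} (about {pst_share:.0%})")
```

On this reading roughly eight percent of the variance in PST scores is associated with intelligence.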

The second instrument was the Causal Test 
(CT). The CT has not yet been widely investigat- 
ed, though it appears to have considerable prom- 
ise. It is a 30-item true-false type, the individ- 
ual items being based on eight descriptions of 
behavior. The test attempts to tap the child's 
awareness of the dynamic, complex, variable na- 
ture of human motivation, though it does not re- 
quire that he have any specific knowledge of the 
causes of behavior themselves. This awareness 
and its hypothesized behavioral concomitants have
been called ‘‘causality’’ (4). The test is scored 
inversely, i.e., for non-causality, so that the 
higher the score, the less causal the subject. 
This was done so that the CT would vary directly
with the PST. The CT has been found to correlate
-.36 with intelligence in fifth grade pupils
and to have a Kuder-Richardson reliability of .63. 
The latter is rather low, but it should be borne 
in mind that attitude and personality tests with 
young children cannot be expected to have the
reliabilities of such measures with adults.
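
For readers unfamiliar with the reliability index quoted for both tests, Kuder-Richardson formula 20 for k dichotomously scored items is k/(k-1) times one minus the ratio of the summed item variances (p times q for each item) to the variance of total scores. The sketch below is illustrative only: the response matrix is a made-up toy example, not study data, and the use of the population variance of total scores is an assumption about the computing convention.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for rows of 0/1 item scores.

    `responses` holds one list per subject and one 0/1 entry per item.
    The population variance (divide by n) of total scores is assumed.
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion of subjects scoring 1 on each item.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical toy data: four subjects, three items.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
toy_reliability = kr20(toy)
print(f"KR-20 for the toy data: {toy_reliability:.2f}")
```

For the CT, each row would hold a pupil's 30 item scores keyed in the non-causal direction, so that the total matches the inverse scoring described above.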

A more detailed description of the CT will be 
found in another forthcoming publication (2). 

Experimental procedure—The tests were ad- 
ministered to all twelve classes on September 29 
and 30, 1954, approximately three weeks after 
the beginning of the fall semester. (The formal 
body of the teacher training program had ended 
in July, 1954.) This administration will be referred
to as the pre-testing. The second administration,
or post-test, took place on April 12
and 13, 1955, approximately six and one-half 
months later. The tests were administered by 
three regular project staff members, each of 
whom tested the same four c'asses inSeptember 
and April. No administrator tested more than 
two classes in any one treatment group. The 
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TABLE If 
ANALYSIS OF VARIANCE OF IQ SCORES 


Grades 
TxG 
Within Cells 
Total 


TABLE IV 


LOSS OF SUBJECTS DUE TO EQUATING CLASS N’s OVER 
TREATMENT GROUPS 


Grade Experimental Control, Control, 
Problem Situations Test 


Fourth 19 (-0) 23 (-1) 17 (-1) 
Fifth 21 (-2) 22 (-0) 16 (-0) 
Sixth (I) 20 (-1) 31 (-9) 24 (-8) 
Sixth (1) 24 (-5) 28 (-6) 24 (-8) 


Total 
Eliminated 16 17 


N Remaining 
Per Class 22 16 


Causal Test 


Fourth 25 (-0) 
Fifth 26 (-1) 
Sixth (I) 31 (-6) 
Sixth (II) 28 (-3) 


Total 
Eliminated 10 


N Remaining 
Per Class 25 
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Source d.f. 8s MS F P 
Treatments 2 565. 95 282. 975 1. 762 > .10 
3 1028, 78 342. 927 2.135 -10 
6 1273.21 212. 202 1,321 >.20 
228 36621. 96 160. 623 cece eee : 
239 39489. 90 
a 
16 (-0) 
17 
23 (-7) 
25 (-9) 
16 
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tests were administered in the same order and 
no time limits were set. Despite this leniency, 
there were a number of incomplete protocols in 
every class, especially for the pre-testing. 
These were invariably discarded, 

The number of subjects who successfully com- 
pleted both pre- and post-tests varied from 
class to class for both of the measures. In order
to avoid complicating an already complex
statistical analysis, it was necessary to equate
the numbers of subjects either over the treat- 
ment groups or over the grade levels. The form- 
er was the technique chosen since it involved 
the smaller loss of subjects for both tests. All 
subjects were first listed randomly, then the 
required number was eliminated according to 
a table of random numbers. Table IV shows the 
original Ns, the number eliminated, and the re- 
maining N for each class and test. 
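
The random elimination described above can be sketched in a few lines. This is an illustration under stated assumptions: the authors drew from a printed table of random numbers, whereas the sketch substitutes a seeded pseudo-random generator, and the function name `equate_class_sizes` and the roster labels are hypothetical. The original class sizes (23, 22, 31, 28) are those Table IV lists for the PST control group that was cut to 22 pupils per class.

```python
import random


def equate_class_sizes(class_rosters, target_n, seed=None):
    """Randomly keep target_n pupils in every class, dropping the rest."""
    rng = random.Random(seed)  # stand-in for a table of random numbers
    kept = {}
    for grade, roster in class_rosters.items():
        if len(roster) < target_n:
            raise ValueError(f"{grade} has fewer than {target_n} pupils")
        # sample() draws without replacement, so no pupil is kept twice.
        kept[grade] = rng.sample(roster, target_n)
    return kept


# Original PST class sizes for one control group, as listed in Table IV.
original_n = {"Fourth": 23, "Fifth": 22, "Sixth (I)": 31, "Sixth (II)": 28}
rosters = {g: [f"{g}-{i}" for i in range(n)] for g, n in original_n.items()}

kept = equate_class_sizes(rosters, target_n=22, seed=0)
eliminated = sum(original_n.values()) - sum(len(v) for v in kept.values())
```

The count in `eliminated` corresponds to the "Total Eliminated" entry of Table IV for that group.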

The elimination of subjects changed the mean
scores per class only slightly, which is the
anticipated result when subjects are randomly
rejected. For the comparison of pre- and post-test
results there are 19 subjects in each experimental
class and 16 subjects in each untreated control
class on both tests; each treated control class
has 22 subjects for the PST and 25 for the CT.
The total number of subjects will be 228 for the
PST and 240 for the CT.

If the teacher training program has had the 
desired results, we would expect that the pupils 
taught by the experimental teachers would show 
greater reductions in PST and CT scores than 
the pupils taught by the control teachers. If the
use of learning materials alone has any significant
effect, we would also expect the treated control
classes to improve more than the untreated control
classes, although, of course, this difference is not
basic to the evaluation of the training program.

Statistical analysis—In a design of this type
we would expect to find some random (though 
insignificant) differences in pre-test scores 
among the treatment groups. Since these pre- 
test differences may have some effect on the 
post-test scores, it would be desirable to elim- 
inate them by means of some statistical tech- 
nique. Hence the appropriate statistical proced- 
ure is an analysis of covariance. 

If the data do not justify the assumptions
necessary for the application of a covariance
analysis,2 there remain two alternate analyses. The
first of these is simply to accept the post-test
scores as a valid index of treatment effects on
the assumption that the lack of significance of
differences among pre-test scores means that the
groups were actually equated prior to treatment.
The second is a sign test (1) based on pre- minus
post-test scores, a non-parametric method
requiring no distributional assumptions.

Results - The PST: Pre-Test—The mean pre- 
test scores on the PST are shown in Table V. 

Obviously there are arithmetic differences
between class means, although the mean scores for
the three treatments, 5.17 for the experimental
group, 5.86 for the treated control group, and
5.38 for the untreated control group, are quite
close together. The results of an analysis of
variance of these scores are shown in Table VI.

The analysis reveals no significant difference
between the treatment means (F = 0.580) and no
significant interaction (F = 1.915). Differences
between grades are significant (F = 3.516,
P = <.02 >.01) but this is of no consequence for the
experimental design. The absence of differences
between treatment means and the lack of
interaction indicate that random sampling has been
accomplished. That is, the classes have been
assigned at random to the treatment groups and
are thus well matched. We may conclude that
this phase of the testing with the PST has been
successful.


The PST: Post-Test—The post-test means
are shown in Table VII. The pre-test means
of Table V are included for comparative
purposes.

The experimental group, with a pre-test
mean of 5.17, dropped to 2.39 on the post-test.
The treated control group fell from 5.86 to 5.14,
a change of less than three-quarters of a point.
The untreated control group rose slightly, from
5.38 to 5.67. The experimental classes show a
unanimous decrease in mean score, the smallest
decrease, that for the fourth grade, being over
1.25 points. Three of the four treated control
classes show decrements, although the overall
decrease is much less than that for the
experimental group. Two of the untreated control
classes show increases and two show decreases,
the net being an increase of 0.29 points.

We now proceed to an analysis of variance of
the post-test scores, which is shown in Table
VIII.

We find on post-test that the difference
between treatment means is now highly
significant (F = 23.4, P = <.001). The differences by
grades remain significant, though of no
consequence. The interaction also remains
insignificant.

Before we can proceed to adjust the post-test
scores by covariance, it is necessary to test for
homogeneity of regression, a key assumption in
the application of covariance. For this test we
break down the adjusted within cells sum of
squares for the post-test scores into two
components, the sum of squares for differences among
group regression lines and the sum of squares for
deviations from the group regression. The mean
square for the former divided by the mean square
for the latter constitutes the F-ratio for the test
of homogeneity of regression. The degrees of
freedom are the number of regressions minus
one for the numerator and N minus twice the
number of regressions for the denominator.


TABLE V
MEAN PRE-TEST SCORES ON THE PROBLEM SITUATIONS TEST

Grade         Experimental   Control      Control        Total
                             (treated)    (untreated)
Fourth            5.05         8.32         5.81          6.53
Fifth             4.95         3.32         4.50          4.19
Sixth (I)         4.74         7.00         6.63          6.14
Sixth (II)        5.95         4.82         4.56          5.12
Total             5.17         5.86         5.38          5.496

TABLE VI
ANALYSIS OF VARIANCE OF PRE-TEST SCORES ON THE PST

Source          d.f.      SS        MS        F       P
Treatments        2      20.86    10.430    0.580   >.20
Grades            3     188.89    62.963    3.516   <.02 >.01
T x G             6     205.78    34.297    1.915   <.10 >.05
Within Cells    216    3868.47    17.910
Total           227    4284.00

TABLE VII
MEAN PRE-TEST AND POST-TEST SCORES ON THE PST

              Experimental   Control        Control         Total
                             (treated)      (untreated)
Grade         Pre    Post    Pre    Post    Pre    Post     Pre     Post
Fourth        5.05   3.74    8.32   6.64    5.81   7.47     6.53    5.89
Fifth         4.95   2.32    3.32   3.05    4.50   5.13     4.19    3.39
Sixth (I)     4.74   2.00    7.00   5.77    6.63   6.06     6.14    4.60
Sixth (II)    5.95   1.53    4.82   5.09    4.56   4.06     5.12    3.61
Total         5.17   2.39    5.86   5.14    5.38   5.67     5.496   4.373

For the PST, the MS for differences among
group regressions is 29.639, the MS for
deviations from group regressions is 5.527. The
F-ratio is 5.363, which is significant beyond the
.001 level for d.f.’s of 11 and 204. We thus
reject the null hypothesis and conclude that
heterogeneity of regression exists among the cells.
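
The arithmetic of this test can be checked with a short sketch. It only restates the rule given in the text (F is the ratio of the two mean squares, with k - 1 and N - 2k degrees of freedom for k regressions); the function name is an assumption, and no P-value is computed because that would require an F distribution table or a statistics library.

```python
def homogeneity_of_regression(ms_among, ms_deviations, n_groups, n_subjects):
    """Return the F-ratio and its degrees of freedom for the test."""
    f_ratio = ms_among / ms_deviations
    df_numerator = n_groups - 1                 # regressions minus one
    df_denominator = n_subjects - 2 * n_groups  # N minus twice the regressions
    return f_ratio, df_numerator, df_denominator


# PST figures from the text: 12 classes (cells), 228 subjects.
f_pst, df1_pst, df2_pst = homogeneity_of_regression(29.639, 5.527, 12, 228)
```

The same function applies to the CT analysis once the number of cells and subjects is changed.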

Regretfully, we are forced to abandon the an- 
alysis of covariance. Simply for purposes of 
completeness, it might be noted that the covari- 
ance analysis would not have changed any of the 
P-values in Table VIII very much. 

Under the heading ‘‘Statistical analysis’’, two 
alternate procedures were suggested in the event 
that the covariance analysis proved inappropri- 
ate. The first of these was to interpret the an- 
alysis in Table VI, which shows no significant 
pre-test differences between treatment groups, 
indicating that the treatment groups were equat- 
ed on the pre-test. Statistically, this is literal- 
ly true since the arithmetic differences are ran- 
dom. With this interpretation, we may now con- 
sider the analysis of the post-test scores, as 
shown in Table VIII, as our critical test. This 
type of experimental analysis is quite common,
in fact more common than covariance.

Table VIII shows that the differences among
treatments and among grades are significant, the
respective Fs being 23.4 and 7.625, both
significant beyond the .001 level. The interaction is
not significant. The next step is to test the
differences among pairs of treatment means. These
means will be found in Table VII. They are 2.39
for the experimental group, 5.14 for the treated
control group, and 5.67 for the untreated control
group. The t-tests of differences between means
are shown in Table IX.
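
The paper does not spell out the formula behind these t-tests. One common choice, assumed here, takes the within cells MS of the post-test analysis (9.758) as the error variance for every comparison. The sketch below applies it to the two control groups, whose post-test means are 5.14 and 5.67 and whose sizes follow from four classes of 22 and 16 pupils; the function name is hypothetical.

```python
import math


def t_between_means(mean_a, n_a, mean_b, n_b, ms_within):
    """t for the difference of two group means with a pooled error term."""
    standard_error = math.sqrt(ms_within * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error


MS_WITHIN_PST_POST = 9.758  # within cells MS, PST post-test analysis

# Control-versus-control comparison: 4 x 16 = 64 and 4 x 22 = 88 pupils.
t_controls = t_between_means(5.67, 64, 5.14, 88, MS_WITHIN_PST_POST)
```

Under this assumption the control-versus-control value comes out close to the 1.03 reported in Table IX.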

The experimental group is clearly lower in 
mean score than either of the two controls. The 
control groups, however, do not differ from each 
other. This general finding is also true of the 
individual class means as shown in Table VII. In 
each case the experimental class has the lowest 
mean. In three of the four grade levels, the
treated control class is lower than the untreated
control class, though the differences are
numerically small.

The second suggestion was a sign test in which
each pair of pre-test and post-test scores is
compared for direction of change. A plus would
indicate that the post-test score was larger than
the pre-test, a minus the reverse, and a zero, no
change. A chi-square is then applied to the
frequencies.

Table X shows the frequencies of plusses, 
minuses, and zeros for the sign test for the 
three treatment groups. 

The chi-square obtained from Table X is
33.46, which is significant beyond the .0001 level
for four degrees of freedom. The results are
clearly in favor of the experimental group, which
has the largest number of minuses and the
smallest number of plusses. The treated control
group has the next largest number of minuses and
the next smallest number of plusses. The trend is
revealed more clearly by breaking down Table X
into its three individual chi-squares, of which
only two need be computed for our purposes.
Comparing the experimental group with the treated
control group, we obtain a chi-square of 13.13,
which is significant beyond the .005 level for
d.f. = 2. Comparing the treated control group with
the untreated control group, the chi-square is
8.21, d.f. = 2, and P = <.02 >.01. In other words,
the experimental group appears to have been most
affected by the treatment, the treated control
group next most affected, and the untreated
control group least affected. Control, in fact
shows almost exactly the number of minuses that
would be expected by chance alone.
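
The chi-squares quoted above can be approximately reproduced from the frequencies of Table X. The sketch below is a plain Pearson chi-square for a contingency table, written without external libraries; the function name is an assumption, and the rows are the experimental group followed by the two control groups in the order of Table X (plus, minus, zero).

```python
def chi_square(table):
    """Pearson chi-square and degrees of freedom for a frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Table X: plus, minus, zero counts for each treatment group.
sign_counts = [
    [6, 57, 13],   # experimental, N = 76
    [27, 48, 13],  # control with N = 88
    [31, 20, 13],  # control with N = 64
]

overall, overall_dof = chi_square(sign_counts)
controls_only, controls_dof = chi_square(sign_counts[1:])
```

Small differences from the published figures are to be expected from rounding in the original hand computation.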

Reliability of the PST—An estimate of
test-retest reliability can be obtained by correlating
the pre- and post-test scores for the untreated
control group. For this purpose we can utilize
data from the abandoned covariance analysis, a
procedure which will provide an overall r with
systematic differences among grade means
eliminated.

The test-retest correlation turns out to be .71.
This is a respectable reliability with a group
which includes a fair smattering of 9-year-olds.
Furthermore, the hiatus between test and re-test
was over six months, whereas it is customarily no
more than a week or two for test-retest
reliabilities. The unusually long gap ordinarily has a
tendency to attenuate the correlation.

The CT: Pre-Test—The mean pre-test
scores for the CT are shown in Table XI.

The analysis of variance of the pre-test scores
is shown in Table XII.

As in the case of the PST, the variance due
to treatments is insignificant (F = 1.734) while
that for grades is significant (F = 11.146,
P = <.001). However, for the CT the interaction
variance is also significant (F = 2.752, P = <.03),
a disturbing occurrence, since it indicates that
the sampling of classes is non-random. The
source of the interaction seems obvious; the
means for fourth and fifth grades in the treated
control group and for the experimental sixth
grade (I) are atypical when compared with means
in the same level or in the same treatment.

Lack of randomness is not unexpected when 
intact school classes are assigned to treatments. 
It is an unfortunate happening in factorial design, 
but in this particular case we need not yet be
overly concerned. If the treatment effects are
exceedingly strong, it is quite possible that the
adjusted post-test scores will not have a
significant interaction. In that case all will be well.
If, however, the interaction effect remains, then 
the within cells MS will no longer be the appro- 
priate error term for testing the main effects 
and the analysis will be very coarse and prob- 
ably unrevealing. 

TABLE VIII
ANALYSIS OF VARIANCE OF POST-TEST SCORES ON THE PST

Source          d.f.      SS        MS        F       P
Treatments        2               228.340   23.4    <.001
Grades            3                          7.625  <.001
T x G             6                                 >.20
Within Cells    216    2107.82     9.758
Total           227    2869.31

TABLE IX
COMPARISONS OF TREATMENT GROUPS ON THE
PST POST-TEST

Comparison                                     t       P
Experimental - Control (treated)             5.68    <.0001
Experimental - Control (untreated)                   <.0001
Control (treated) - Control (untreated)      1.03     .30

TABLE X
SIGN TEST ANALYSIS OF PRE- MINUS POST-TEST SCORES
ON THE PST

Group                   Plus   Minus   Zero   Total
Experimental              6     57      13      76
Control (treated)        27     48      13      88
Control (untreated)      31     20      13      64

Chi-square = 33.46; d.f. = 4; P = <.0001

TABLE XI
MEAN PRE-TEST SCORES ON THE CAUSAL TEST

Grade         Experimental   Control      Control        Total
                             (treated)    (untreated)
Fourth           12.53        15.76        14.50         14.40
Fifth            12.05         9.24        12.63         11.03
Sixth (I)         8.42        11.48        11.13         10.42
Sixth (II)       10.84        10.68        10.88         10.78
Total            10.96        11.79        12.28         11.658

TABLE XII
ANALYSIS OF VARIANCE OF PRE-TEST SCORES ON THE CT

Source          d.f.    SS       MS        F        P
Treatments        2             31.785    1.734   <.20 >.10
Grades            3            204.293   11.146   <.001
T x G             6             50.440    2.752   <.03
Within Cells    228             18.328
Total           239

The CT: Post-Test—The post-test means
are shown in Table XIII. The pre-test means
are again included for comparative purposes.

TABLE XIII
MEAN PRE-TEST AND POST-TEST SCORES ON THE CT

              Experimental    Control         Control          Total
                              (treated)       (untreated)
Grade         Pre     Post    Pre     Post    Pre     Post     Pre      Post
Fourth        12.53   5.53    15.76   12.44   14.50   12.56    14.40    10.28
Fifth         12.05   4.63     9.24    5.88   12.63   11.25    11.03     6.92
Sixth (I)      8.42   3.58    11.48    9.44   11.13   10.44    10.42     7.85
Sixth (II)    10.84   5.58    10.68    8.56   10.88    6.88    10.78     7.17
Total         10.96   4.83    11.79    9.08   12.28   10.28    11.658    8.054

All twelve of the classes show some
decrement, and all three treatment groups show
reductions in mean score. The untreated control
group fell 2.00 points, the treated control group
2.71 points, while the experimental group dropped
over six points, a decrease of more than 55
percent. The analysis of variance of post-test
scores is shown in Table XIV.

The variance due to interaction is clearly still
significant (F = 5.031, P = <.001). F-ratios
and P-values for treatments and grades were
computed using both the within cells MS and the
T x G MS as error terms. The treatments MS
is significant in either case, the respective Fs
being 40.040 and 7.958, the respective Ps,
<.001 and <.025.
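
The two sets of F-ratios differ only in the error term placed in the denominator. A brief sketch, using the mean squares reported in Table XIV (the dictionary keys and function name are assumptions made for illustration):

```python
# Mean squares for the CT post-test analysis (Table XIV).
mean_squares = {
    "treatments": 606.610,
    "grades": 141.850,
    "t_x_g": 76.228,
    "within": 15.150,
}


def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by the chosen error mean square."""
    return ms_effect / ms_error


# Error term 1: within cells MS (appropriate when interaction is negligible).
f_within = {
    effect: f_ratio(mean_squares[effect], mean_squares["within"])
    for effect in ("treatments", "grades", "t_x_g")
}

# Error term 2: T x G MS (used when the interaction is significant).
f_interaction = {
    effect: f_ratio(mean_squares[effect], mean_squares["t_x_g"])
    for effect in ("treatments", "grades")
}
```

Because the T x G mean square is about five times the within cells mean square here, every F tested against it shrinks by the same factor, which is why the grades effect loses significance under the second error term.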

The significance of differences among
treatment groups is encouraging, but the persistent
interaction is still a problem. There is not much
point in testing for homogeneity of regression
until we determine whether or not the interaction
will remain significant when it is adjusted by
covariance. Accordingly, the adjusted interaction
MS and the adjusted within cells MS were
computed. The results are shown in Table XV.

The interaction remains significant even after
adjustment, the F-ratio being 7.956, P = <.001.
This means that the within cells MS is no longer
an appropriate error term for testing the main
effects. The design would be left with only 10
degrees of freedom, 2 for treatments, 3 for grades,
and 5 for interaction (since 1 d.f. is lost from
the error term due to adjustment). Such an
analysis could hardly be expected to provide
significant results unless the treatments were
practically infinitely powerful. One would hardly
consider undertaking an experiment with only three
scores in each treatment group.

Rather than forego the increased sensitivity
of design offered by the within cells error, the
data were inspected in the hope of discovering
the source of the significant interaction. An
examination of the data in Table XIII revealed that
the sixth grade (II) class in the untreated control
group had dropped significantly on the post-test.
Its pre-test mean was 10.88 and its post-test
mean was 6.88. The t-score of the difference is
4.65, which is significant beyond the .01 level
for 14 d.f. The difference of 4.00 points is more
than twice that for any other class in the
untreated control group and greater than that for
any class in the treated control group. This
class evidently contributes a considerable amount
to the significance of the interaction. It does not
seem conceivable that a single untreated control
class should show a significant decrement. It is
probable that this class had been exposed to some
uncontrolled ‘‘treatment’’ during the course of the
six months intervening between pre- and
post-tests.4 It was decided that sufficient grounds
existed for dropping out this entire level from the
analysis proper, if for no other reason than to
determine statistically if this single class was,
in fact, accountable for any large part of the
interaction. The recomputed analysis of pre-test
scores based on 9 classes and 180 subjects is
shown in Table XVI.

The results are almost identical with those 
of the original analysis of pre-test scores in 
Table XII. Treatment variance is insignificant 
and variances due to grades and interaction are 
significant. As in Table XII, if the T x G MS is
used as the error term, the grades variance al- 
so becomes insignificant. 

The analysis of post-test scores is shown in
Table XVII.

Once again the results are practically un- 
changed. Except for minor discrepancies in P- 
values, Table XVII shows the same kind of data 
as did Table XIV, the original post-test analysis.
All three effects are significant when tested by 
the within cells error; the treatment variance re- 
mains significant when T x G is the error term, 
while the grades variance becomes insignificant. 

So far, the elimination of the sixth grade (II) 
level has not changed the analysis. The next 
step is to adjust the within cells MS and the T x 
G MS by covariance, as in the original analysis. 
The computations are shown in Table XVIII.

The F-ratio for the interaction is now only
0.994, which is clearly insignificant. A comparison
of the results in Table XVIII with those of
the original analysis in Table XV reveals the
marked effect of the elimination of the sixth
grade (II) untreated control class. No other
result is changed, but the interaction goes from
highly significant to insignificant.

Homogeneity of regression must still be
demonstrated before we can proceed to make the
crucial adjustments of the treatment means. The
MS for differences among group regressions is
14.093 and the MS for deviations from group
regression is 8.577. The F-ratio for the test is
14.093/8.577 = 1.643. The P-value is <.20 >.10
for 8 and 162 degrees of freedom. Hence, we
may accept the null hypothesis and conclude that
the cells have homogeneous regressions.

Having demonstrated homogeneity of
regression, we may now proceed to adjust the sums of
squares for treatments and grades for the
crucial test. The adjusted data, plus the data of
Table XVIII, are shown in Table XIX.

TABLE XIV
ANALYSIS OF VARIANCE OF POST-TEST SCORES ON THE CT

                                                 F                 P
Source          d.f.     SS        MS       Within   Int.     Within   Int.
Treatments        2    1213.22   606.610    40.040   7.958    <.001    <.025
Grades            3     425.55   141.850     9.363   1.861    <.001    >.20
T x G             6     457.37    76.228     5.031            <.001
Within Cells    228    3454.16    15.150
Total           239    5550.30

TABLE XV
COMPUTATION OF THE POST-TEST INTERACTION ON THE CT
ADJUSTED BY COVARIANCE

Source          d.f.     SS        MS       F       P
T x G             6     440.21             7.956   <.001
Within Cells    227*   2093.39    9.222

*One degree of freedom lost due to adjustment

TABLE XVI
ANALYSIS OF VARIANCE OF PRE-TEST SCORES ON THE CT
WITH GRADE LEVEL 6 (II) ELIMINATED

Source          d.f.    SS       MS         F        P
Treatments        2             42.735     2.395    .10
Grades            2            275.615    15.458   <.001
T x G             4             70.068     3.927    .005
Within Cells    171             17.843
Total           179

The adjusted variance for treatments yields
an F-ratio of 53.933, which is significant
beyond the .0001 level. The F for grades is 4.024,
P = .02. There can be no doubt but that the
treatment variance is significant in the final
analysis and we must conclude that there have
been real treatment effects during the six and
one-half months intervening between pre- and
post-testings. By comparing the unadjusted
within cells MS (14.828) with the adjusted within
cells MS (8.837) we see that the covariance
analysis has nearly doubled the precision of the
tests of the main effects, a result which makes
the required time and effort well worthwhile.

We can now be certain that the treatment
effects are significant, but we do not yet know
which group or groups account for the
significance. To investigate this point we first adjust
the cell and treatment means and then apply
t-tests to individual pairs. The adjustment of
means is accomplished by the use of a
regression equation which is derived from the
covariance analysis.
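
The regression adjustment of a mean can be sketched with the textbook covariance formula: the post-test mean is moved along the pooled within-cells regression line to where it would lie if the cell's pre-test mean equalled the grand pre-test mean. The paper reports neither the regression slope nor the grand mean it used, so the values of `slope` and `grand_pre_mean` below are assumptions chosen purely for illustration; only the two class means (12.53 and 5.53, the fourth grade experimental class on the CT) come from the paper.

```python
def adjusted_mean(post_mean, pre_mean, grand_pre_mean, slope):
    """Post-test mean adjusted for the cell's departure on the pre-test."""
    return post_mean - slope * (pre_mean - grand_pre_mean)


# Fourth grade experimental class on the CT (pre 12.53, post 5.53).
# slope and grand_pre_mean are hypothetical, not values from the paper.
example = adjusted_mean(
    post_mean=5.53,
    pre_mean=12.53,
    grand_pre_mean=11.95,
    slope=0.58,
)
```

A class that started above the grand pre-test mean has its post-test mean lowered, and a class that started below it has its post-test mean raised, which is what allows the adjusted means to be compared without reference to the pre-test.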

Table XX lists the adjusted means for each 
class and for the various treatment groups. 
These means have had the effects of the respec- 
tive pre-test means eliminated from them and 
will thus stand alone for comparison with each 
other without reference to the pre-test means. 

The next step is to compute t-tests for
comparisons of pairs of treatment means. The three
t-test results are shown in Table XXI.

The t-tests reveal a clear-cut trend; the
experimental group has the lowest adjusted mean
score, differing significantly from both control
groups; the treated control group has a
significantly lower mean than the untreated control
group. Returning to Table XX, we see that this
trend holds true for all of the grade levels as
well as for the treatment means.

The main analysis is now complete. It will 
be discussed in the next section. 

Reliability of the CT—The test-retest relia- 
bility of the CT is .73. The remarks concern- 
ing the reliability of the PST also apply here. 

Discussion and Conclusions—There is little 
doubt that the classes of the experimental teach- 
ers showed a marked change on both measures 
when compared with the classes of the control 
teachers. The statistical difficulties (the lack
of homogeneity of regression for the PST and
the peculiar interaction effect for the CT) do
not obviate the large experimental-control
differences.

It is perhaps unfortunate that the covariance
analysis was inapplicable to the PST. There is,
however, a plausible explanation for the
heterogeneity of regression which precluded its use.
Examination of Table VII shows that the post-test
means for the experimental classes, especially
the fifth grade and the sixth grades, are
perilously near the ceiling (i.e., the lowest
possible score) of the test. This means that a
considerable number of subjects obtained the same
scores, mostly in the range 0-2. Since no one
could improve beyond a score of zero, many of
the subjects whose pre-test scores varied
achieved a common post-test score. This tends
to attenuate the pre-post correlation. This was,
of course, not true for the control groups. The
net result was that the experimental classes
showed different regressions than the controls.
The correlation between pre- and post-test
scores was .74 for the treated control subjects,
.71 for the untreated control subjects, but only
.44 for the experimental group.

Evidently the PST is an inadequate index for 
the pre-post type of experimental design since 
(a) the mean PST pre-scores are too low, and
(b) the ceiling of the test is then too close to the 
pre-scores to permit adequate discrimination 
among members of experimental groups. 

The performance of the treated control classes
is not easily evaluated. This group of teachers
was made acquainted with various teaching aids
used by the experimental teachers and was
permitted to use any that they wished in any fashion
and for any amount of time. Only the briefest
instructions concerning use of the materials
were given since it was felt that such
instructions fell within the province of the teacher
training program. The purpose of the inclusion of
this group was to attempt to determine the effects
of using the teaching materials uninstructed as
opposed to the effects of the training program.
The results are somewhat ambiguous. Two of
the three analyses show that the treated control
subjects improved significantly more than the
wholly untreated control subjects, though not as
much as the experimental subjects. Individual
t-tests based on adjusted CT scores and individual
chi-squares based on the sign test of PST results
reveal this trend. The t-tests derived from the
analysis of variance of PST results do not show
this trend.

In an attempt to investigate this point further,
the teachers in the experimental group and in
the treated control group were asked to estimate
the amount of class time spent in the use of the
teaching aids. The amount of time in hours for
each teacher is shown in Table XXII.

The experimental teachers used the teaching
materials much more than did the control
teachers, as would be expected. The experimental
teachers also varied only slightly among
themselves, as evidenced by the average deviation
from the mean of 1.5 and the mean of 34 hours.
On the other hand, the control teachers varied
considerably, the average deviation being 4.5
and the mean 7 hours.
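
The means and average deviations quoted here follow directly from the hours in Table XXII. A short sketch of the computation (the function name is an assumption; the average deviation is taken as the mean absolute deviation from the mean):

```python
def mean_and_average_deviation(values):
    """Return the mean and the mean absolute deviation from that mean."""
    mean = sum(values) / len(values)
    average_deviation = sum(abs(v - mean) for v in values) / len(values)
    return mean, average_deviation


# Classroom hours spent using teaching aids (Table XXII), in grade order:
# fourth, fifth, sixth (I), sixth (II).
experimental_hours = [33, 33, 37, 33]
control_hours = [14, 2, 9, 3]

experimental_summary = mean_and_average_deviation(experimental_hours)
control_summary = mean_and_average_deviation(control_hours)
```

Relative to its mean, the control group's average deviation is far larger than the experimental group's, which is the contrast the paragraph draws.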

From the data in Table VIII we saw that there
was no interaction between treatments and grades
on the PST although both effects were significant.
The same conclusions apply to the CT (Table
XIX) when the sixth grade (II) level was
eliminated. The lack of interaction means that the
differences between experimental and treated
control classes were about the same from grade
level to grade level despite the significant
overall differences in grade level scores. The
difference between the experimental fourth grade
class and the treated control fourth grade class
is about the same as the difference between the
two fifth grade classes, and so on.
TABLE XVII
ANALYSIS OF VARIANCE OF POST-TEST SCORES ON THE CT WITH
GRADE LEVEL 6 (II) ELIMINATED

                                                 F                 P
Source          d.f.     SS        MS       Within   Int.     Within   Int.
Treatments        2    1341.72   670.860    45.243   11.642   <.001    <.025
Grades            2     362.53   181.265    12.225    3.146   <.001    >.20
T x G             4
Within Cells
Total

TABLE XVIII
COMPUTATION OF THE POST-TEST INTERACTION ON THE CT WITH GRADE
LEVEL 6 (II) ELIMINATED, ADJUSTED BY COVARIANCE

Source          d.f.     SS        MS       F
T x G             4      35.13    8.738
Within Cells    170*   1502.25    8.837

*One degree of freedom lost due to adjustment

TABLE XIX
ANALYSIS OF COVARIANCE OF THE POST-TEST CT SCORES WITH
GRADE LEVEL 6 (II) ELIMINATED

Source          d.f.     SS        MS        F        P
Treatments        2     953.21   476.605   53.933   <.0001
Grades            2      71.12    35.560    4.024    .02
T x G             4      35.13     8.738    0.994   >.20
Within Cells    170*   1502.25     8.837

*One degree of freedom lost due to adjustment

TABLE XX
POST-TEST MEAN SCORES ON THE CT,
ADJUSTED BY COVARIANCE

Grade         Experimental   Control      Control
                             (treated)    (untreated)
Fourth           5.192        10.223       11.076
Fifth            4.572         7.457       10.854
Sixth            5.634         9.714       10.917
Total            5.132         9.131       10.951

TABLE XXI
COMPARISONS OF TREATMENT GROUPS ON THE CT ADJUSTED
POST-TEST SCORES

Comparison                                     t        P
Experimental - Control (treated)             7.603    <.0001
Experimental - Control (untreated)           9.793    <.0001
Control (treated) - Control (untreated)      3.315     .001

TABLE XXII
CLASSROOM HOURS SPENT USING TEACHING AIDS

                                      Sixth   Sixth            Average
Group                Fourth   Fifth    (I)     (II)    Mean    Deviation
Experimental           33      33      37      33       34       1.5
Control (treated)      14       2       9       3        7       4.5


December, 1955) 


The data in Table XXII show that the
experimental classes were all treated approximately
alike with respect to number of hours of use of
teaching aids. The control classes, however,
varied from 2 hours to 14 hours. If the use of
teaching aids alone had any real effects, we
would expect to have found a variation in
differences between pairs of classes. The two fourth
grades, for example, should not differ as much
as the two fifth grade classes since the fourth
grade treated control teacher spent seven times
as much time with teaching aids as did the fifth
grade treated control teacher. This variation
would have resulted in a significant interaction
between treatment and grades. Since no such
interactions were found with the PST or with the
CT after elimination of the sixth grade (II) level,5
we conclude that amount of class time spent
using the teaching materials was probably not the
sole factor making for reduction in test scores,
even if we accept the conclusion that there was
a significant change in the scores of the subjects
in the treated control classes. That we should
accept this conclusion is still open to question,
as is the matter of what factor did influence the
scores of the subjects in the treated control
group if we do not accept it. We have gone as far
as we can reasonably go with the present
analysis. Data are not available to settle either
question satisfactorily.

Summary—The subjects of this investigation
were four classroom teachers and their pupils,
each classroom matched with two control groups.
The teachers participated in a training program
designed to extend their understanding and
appreciation of child behavior, to provide opportunity
for growth in personal adjustment and to
develop methods for teaching causally oriented
curricular content.

Tests of the child’s awareness of the complex 
multiple causative nature of human behavior and 
of his tendency to immediate punitiveness were 
administered to both experimental and control 
groups in the fall of the school year and again 
approximately 6-1/2 months later. 

Extended analyses of the results using both 
parametric and non-parametric methods, as the 
nature of the data indicated, were applied. 

The classes of the experimental teachers 
showed distinctly significant changes on the two 
measures used when compared with classes of
the control teachers. 

It thus appears that when we bring children 
of the upper elementary grade levels under the 
influence of causally oriented teachers teaching 
causal content we bring about significant
differences in the child’s growth in the aspects
measured in this study.

Additional differences between causally ori- 
ented and control subjects will be reported in 
other papers. 
FOOTNOTES


1. All of the analyses of variance computed for
this report are based on Lindquist’s procedures
(5). ‘‘Treatments’’ refers to the three primary
groups, the experimental and the two controls.
‘‘T x G’’ is the treatment-by-grades interaction.
The ‘‘within cells’’ mean square is the overall
standard error when other systematic differences
have been eliminated. If the interaction is not
significant, the mean square for within cells is
the appropriate error term for testing the
effects. In the remaining tables of this kind,
‘‘sum of squares’’ will appear as SS and ‘‘mean
square’’ as MS.


2. The assumptions necessary for the use of co- 
variance will be found in (5). 


3. Strictly speaking the within cells MS is not
the appropriate error term in Table XII,
although it was used there to test the main
effects. If the T x G MS, which is really the
appropriate error term, had been used, the
respective F-ratios for treatments and grades
would be 0.636 and 4.05, the latter falling just
short of the .05 level of significance. Since the
treatments MS simply remains insignificant and
since we are not interested in grade differences,
it is immaterial which error term is used.


4. A lengthy interview with the teacher by the
experimenter most familiar with her (Whiteside)
did not provide any clues as to what this
uncontrolled factor might have been.


5. The interaction did not actually involve the
sixth grade (II) class in Control, but was rather
a function of this class in Control,. Comparing
only the experimental and Control, groups, there
was no significant interaction even with the
sixth grade (II) class included.
THE SELECTION OF CANDIDATES FOR
TEACHER EDUCATION AT THE 
UNIVERSITY OF WISCONSIN 


GUSTAVE JOHN STOELTING * 
Milwaukee Public Schools 


SECTION I 


BACKGROUND OF THE PRESENT 
INVESTIGATION 


A. Basic Principles in Screening of Candidates 
for Teaching 


MANY NEW screening procedures have 
come to be used by teacher training institu- 
tions during the last decade in an effort to se- 
lect more students capable of becoming super- 
ior teachers. This seems important if our 
schools are to have competent leadership. To- 
day the majority of teacher-education schools 
employ some form of screening of persons ad- 
mitted or educated as teachers. 

This strong interest in teacher selection 
arises out of a three-fold need: 


1. To maintain high standards in the pro- 
fession at a time when emergency measures to 
overcome teacher shortages may permit stand- 
ards to fall. 

2. To find more individuals equal to the 
increasingly complex task of teaching. 

3. To prevent the wastes created when in- 
dividuals are trained for positions for which 
they are personally or intellectually not quali- 
fied. 


The teacher shortages of the past twelve
years have given rise to emergency measures
that permit large numbers of poorly qualified
individuals to become teachers. While teacher
training institutions and professional
organizations have attempted to overcome shortages
with programs of recruitment, merely encouraging
larger numbers to become teachers is not an
adequate solution to the problem. To have good
teachers it is necessary to exercise a certain
amount of selection from among the larger
groups of interested individuals who choose to
become teachers. The result has been a greater
variety of screening devices and their more
widespread use.

The literature on teacher selection repeatedly
stresses the importance of protecting the
public welfare through careful selection of
candidates for teacher education. New concepts of
learning and development of young individuals
combine to make classroom management a highly
intricate procedure. The necessity of helping
young people understand a complex environment
and the demands that it makes on the individual
emphasize further the necessity for more
competent teachers. Individuals with specific
qualities are frequently called to meet the
requirements of special situations. Stiles (30)
summarizes this point as follows:


Superficial consideration might lead 
one to believe that democratic principles
would compel institutions to admit
all who desire to become candidates for 
teacher education. More objective 
thought, however, would help one to 
realize that since education is a func- 
tion of the state and is maintained for 
its own good, therefore, the state has 
not only the right but also the responsi- 
bility to secure the best possible teach- 
ers. Merely providing state institutions 
of higher learning with competent pro- 
fessors and adequate curricula will not 
assure the state that superior teachers 
will be developed. The type of teacher 
that the university or teachers college 
will ultimately produce is dependent upon
the quality of persons who are accepted
for training.


Much of what has been done in construction
of devices for screening of teacher candidates
has been based on investigations of factors
involved in successful teaching. The factors most
commonly used in screening teacher candidates
are intelligence and scholastic achievement. La
Duke (19), Rostker (26), and Seagoe (27)
independently investigated the relationship between
intelligence and success in teaching. They found
a significant, positive relationship. As
measures of intelligence have become generally more
reliable, the use of this device for screening of
teacher candidates has become almost universal.

Lins (20) and Stuit (31) provide data on the
relationship between scholastic achievement and
success in teaching. Most teacher training
institutions today specify a minimum of scholastic
achievement as a part of their program of
screening for teacher candidates.

* The author wishes to express his appreciation to Dr. Liddle, Dr. Eye, and Dr. A. Barr
for helpful criticisms and suggestions in the

In addition to using intelligence and scholastic
achievement as measures for predicting future
teaching success, some teacher training
institutions are also using less well known
measures of yet other factors in teaching success.
Flanagan (11) and Seagoe (27) provide data
supporting the use of a general culture test for
screening prospective teachers. The importance of
proficiency in reading and speech as vital qualities
of the successful teacher is supported by data
provided by Flanagan (11), Henrickson (14), and
McCoard (21).

Personality as a factor in teacher success
has stimulated much interest and lively
discussion. The use of personality measures in
teacher selection is on the increase. More general
use of such measures is limited because both
satisfactory rating scales and standards of teacher
personality are lacking. Some institutions now
using this factor in a screening device do so only
to discover instability in a candidate.
Experimental screenings of candidates using
personality devices are under study in a number of
teacher training institutions.


B. General Features of Screening for Teacher 
Selection 


The screening devices described in this section
and their placement in a program of selection
represent a composite of practices as
reported in the literature on teacher selection.
There is much variation among teacher training
institutions in this respect.

Admission is the crucial point in most teach- 
er selection programs. This is true largely be- 
cause failure to predict the success of a candidate 
at the time of admission may lead to great waste. 
As a result, a large number of devices are in use 
by teacher training institutions to screen for ad- 
mission. Screening devices may be divided into 
two categories: application and orientation.

Screening through application generally is 
based on information on the educational and home
background. The most frequently used data are
the high school record, scholastic attainment, 
personality and attitude ratings of the applicant 
by school staff members, standardized test 
scores, and participation in extra-curricular ac- 
tivities. To complete the data on educational 
background the administrative head of the school 
from which the applicant is graduated is ordinar- 
ily requested to make some sort of a statement 
regarding the applicant's general acceptability 
for continued training. 

Sometimes the information on educational 
background is supplemented by a wide variety of 
information on home and personal background. 
Such questions as age, occupation, and
educational attainment of various members of the
family are asked. The applicant is also sometimes
requested to furnish an autobiography, a report 
of financial responsibility, a statement of pur- 
poses for continuing his education, and personal 
references regarding his acceptability. 

Admission is usually followed by a period of 
intensive testing to aid in orientation. Some in- 
stitutions prefer to have the results of such test- 
ing in hand before the candidate’s application for 
admission is acted upon. While the latter ar- 
rangement has some obvious disadvantages, it 
does provide much additional data to assist in
making the vital decision made at the time of
admission.

Testing programs at the time of admission 
generally include such areas as English place- 
ment, general culture, intelligence, interests, 
personality, and reading. Within these areas 
there is a wide variety of instruments used.

The greatest variation in the selection and 
use of test instruments lies in the area of per- 
sonality where there appears to be little agree- 
ment on the qualities of a good teacher. Among 
the instruments frequently used are the Minne- 
sota Multiphasic Personality Inventory, the Bell 
Adjustment Inventory, and the Bernreuter
Personality Inventory. A subjective technique,
the group interview, and projective techniques
are being explored by several teacher
training institutions.

Two devices other than standardized tests
are also commonly used in screening for teacher
selection: the physical examination and the
speech test. The physical examination ordinarily
seeks to determine not only whether an individual
is capable of undertaking a normal course of
studies, but also whether he has any physical
defects which might limit his efficiency as a
teacher and thus increase the element of risk
involved in accepting him as a candidate.

Speech tests have a dual purpose. They are
used to eliminate those applicants whose speech 
defects are such as to be a definite handicap in 
the profession. They also disclose remediable 
defects in applicants otherwise qualified. 

Applications for admission generally are re- 
viewed by an admissions official responsible for 
weighing the quality of each applicant as a stu- 
dent and prospective teacher in the light of all 
the evidence available. As screening procedures 
have become more refined a few teacher train- 
ing institutions have turned the important task of 
evaluating an applicant’s qualifications over to a
committee. Such a move emphasizes the impor- 
tance attached to the evaluation of an individual’s 
potential success as a teacher when application 
is made for admission. 

A second important point in the teacher
selection programs of many teacher training
institutions occurs at the end of the second year of
preparation, or the beginning of the third year.
Here it takes the form of ‘‘Admission to
Professional Study’’ or ‘‘Admission to Senior College’’,
or it may simply constitute a more intensive
stage in evaluation of the candidate’s
qualifications in a continuous screening process. By and
large, no matter what the name or what
procedures are used, at this point the candidate's
progress is examined as to his suitability as a
teacher.

The most important new data commonly used
at this point is the record of a candidate’s
academic achievement and the pattern of credits
earned. Academic achievement as measured by
the Grade Point Average or its equivalent is used
by many schools to maintain minimum standards.
Of all devices for screening reported in the
literature, the maintenance of minimum scholastic
standards appears most frequently.

To insure uniform training deemed essential 
for successful teachers many teacher training in- 
stitutions specify an academic pattern to be fol- 
lowed by their candidates. While this device does 
not provide specifically for screening, it again 
sets basic requirements which the institution 
deems necessary for teacher success. 

Some teacher training institutions employ
personal interviews, a physical examination, and
a speech test at this point, to determine
admission to the teacher training program, rather
than at the time of admission to the school. A few
institutions use these devices both at the time of
admission and after the first two years of training.

As further evidence of a candidate's
progress in becoming a teacher some schools
examine the type of activities in which the candidate
has engaged beyond the requirements of the
curriculum. Emphasis is placed upon participation
in those activities which appear to contribute to
teacher success following graduation. The use
of this device, as in the case of specific
curricular requirements, does not necessarily
constitute screening in that it selects the better
candidate and eliminates the poorer. It does,
however, set requirements which must be met, thus
exposing the candidate to further experiences and
training on which he may draw later as a
well-qualified teacher.

The pattern of evaluation already discussed 
is often combined with evaluation of achievement 
in professional training courses and practice 
teaching. Thus a continuous process of evalua- 
tion covering the entire period of professional 
training is provided. 

In general, it would appear that many of the
screening practices of the final two years of
training are different from those that are
commonly employed at the time of original
admission. Most teacher training schools do the
large share of screening at the time of
admission; the promising candidates are admitted to
school and the poor risks are eliminated.
Thereafter it would appear that the function of
screening changes. If a candidate continues to make
normal progress in meeting the increasing
requirements of training and skill, the screening
process ‘‘selects’’ him for further training. On
the other hand, a candidate is eliminated if he
does not have sufficient interest to fulfill the
screening requirements satisfactorily or if a
defect serious enough to impair successful
teaching is discovered.

The final stage of screening by teacher
training schools is graduation, certification, and
placement. Most schools of education combine
graduation and certification, i.e., the school
certifies the individual as a teacher when he
graduates. Some few schools draw a finer distinction
between graduation and certification. These
schools will graduate a candidate, provided he
has fulfilled all the requirements; he may have
achieved only the minimum professional
standards required by law, but because he has not
attained the standards set by the school, the
candidate is not certified. To gain certification at
such schools candidates must attain higher
qualifications. The institution, in turn, certifies to
the professional competence of the individual.

Placement is not generally regarded as a
part of the screening program. However,
teacher training institutions are sensitive to its
screening function. Screening controls are
restricted or relaxed as opportunities for
placement change. In addition, the use of screening
devices indicates the desire by schools for
teacher education to reduce waste by eliminating those
who might have difficulty in being placed.


C. Survey of the Literature 


The literature on teacher selection may be 
divided into four areas: 


1. General Philosophy 


Teacher selection is based on the important
principle that good schools depend on well
qualified teachers. Such standards of competency in
turn depend upon good training, a background of
accepted research, and capable individuals. But
good training can be had only as research
findings in the field are put in practice. On this
basis the literature recognizes that the better
selection of candidates is the key to higher
professional standards.

It is further pointed out in the literature that
better teachers are needed for the increasingly
complex job of teaching. As the center of
learning moves away from the highly organized fields
of subject matter toward the needs of the
individuals, the task of guiding and directing the
learning process places increasing demands
upon the teacher. In order to discharge such an
important task adequately, more capable
individuals are needed.

The literature also points to the necessity 
for reducing the human and social waste involved 
in training and employing individuals who are not 
well suited to the task of teaching. The develop- 
ment and use of good selection procedures is a 
means of avoiding such waste, while at the same 
time providing for more efficient use of the
teacher-training facilities.

Comprehensive statements of basic philoso- 
phy regarding teacher selection are found in
articles by Flowers (12), Kirkpatrick (16), and
Morris and Phillipson (23). The latter article is
particularly valuable in that it describes the areas
of research necessary to develop more adequate 
teacher-selection procedures. 


2. General Surveys of Teacher Selection 
Procedures 


Studies on the prevailing practices in teacher
selection in limited areas of the United States
have been reported by Haskew (13), and Stiles 
(30). Their conclusions may be summarized as 
follows: 


a. Most institutions have a teacher selection
program.

b. The programs range from simple requirements
to highly developed procedures in the
process of being further refined.

c. The element of timing in selection procedures
varies from a single, ‘‘on the spot’’ selection
made only once to a continuous selection
process beginning while the candidate is still in
high school and extending to graduation and
placement.

d. Selection is the responsibility of one person
in most institutions. Some few have a
committee.

e. Most common bases of selection are:

Scholastic achievement
High school record
Results of aptitude tests
Results of interviews

f. Personality traits are being studied and used
increasingly in teacher selection programs.

g. Some teacher training institutions without a
selection program feel that as public
institutions they do not have the right to exclude.

h. Most of the institutions without a selection
program do not have one because of an apparent
lack of reliable bases to make a selection.


3. Descriptions of Existing Programs of Selection


Reports of specific programs of teacher
selection are common in the literature. Four of
these reports bear mention here for they
represent the most advanced practices in teacher
selection. The selection program in the
Connecticut Teacher's Colleges is described by
Engleman and Larson (10); the plan in use in New
Jersey is reviewed by West (32); the San Diego State
College selection program is described in an
article by Alcorn (1); and the development and
operation of the teacher selection program at
Syracuse University in New York is given by Smith
(28) and White (33). The significant feature of
these teacher selection plans is that they
utilize subjective techniques of personality
analysis in addition to other more common sources
of information to aid in making a judgment.

4. Summaries of the Literature


Comprehensive reviews of the literature on
teacher selection have been written by Archer (3),
Barr (4), and Haskew (13). From these articles
the following conclusions may be drawn:


a. There is a lack of reliable objective data on
which teacher selection may be based.

b. The reviews report several studies to show
significant correlations between success in
teaching and scholarship.

c. Increasing use is being made of tests of
aptitude for various special fields.

d. Speech tests are becoming more common in
programs of teacher selection.

e. Greater emphasis is being placed on
personality in teacher selection. Most of the
teacher training institutions using this factor
employ it to detect the unstable. Some few
institutions report the use of experimental
techniques to select candidates who demonstrate
desirable personality traits in a social
situation.

f. Evidence of leadership qualities is becoming
increasingly important as a part of
teacher-selection programs.

g. Tests of proficiency in basic skills are being
used to supplement intelligence tests.

h. Committee selection procedures are steadily
replacing selection by a single individual.

i. There is an increasing recognition that
teacher selection cannot depend on a single factor,
but must be based on a constellation of factors.


SECTION II
STATEMENT OF THE PROBLEM 
A. Selection and Teacher Success 


THE CENTRAL problem of this study is to
determine the efficiency with which the several
selective devices employed at the University of
Wisconsin operate in choosing potentially
successful teachers out of the total group seeking
admission and eventual certification for
teaching. To do this the study will seek to answer
five questions:


1. How well do present selection procedures
discriminate between the superior teacher
candidate and the teacher candidate who is likely to
meet with only limited success?

2. Under what circumstances do selection
devices now employed permit admission of
individuals not likely to succeed?

3. Is there basis for raising or lowering the
standards by which candidates are admitted to
pre-service training and certification as teachers?

4. At what point in the teacher education program
is the screening for teacher education likely
to be most effective?

5. What recommendations, based on the findings
of the study, can be made for improved
procedures for the selection of candidates for
teaching?


To study the effectiveness of the screening 
devices used at the University of Wisconsin, the 
data on which selection of the 1952 graduating 
class of the School of Education was based will 
be related to success of the individuals of the 
class following graduation. These data include:


a. Rank in high school class 
b. Psychological scores 
Henmon-Nelson 
American Council on Education 
c. Cooperative Reading test score 
d. Cooperative General Culture test score 
e. Predicted Grade Point average 
f. Earned Grade Point average 
g. Minnesota Multiphasic Personality Inventory 
test score 
h. Speech proficiency test score 


These data will be correlated to the various 
criteria for measuring success in teaching. The 
criteria will consist of: 


a. An in-service rating by the principal or super- 
intendent of those who were employed in a 
teaching situation during the year since grad- 
uation. 

b. A departmental rating based on the estimate
of a candidate's effectiveness as a teacher by 
the faculty of his major department. 

c. A Placement Bureau rating based on the can- 
didate’s general acceptability as a teacher. 

d. Practice-teaching grades.


These ratings will first be considered separately 
after which they will be combined into a single 
rating for each individual included in the study. 
A measure of the efficiency of the selection
procedures used by the School of Education will
be obtained through correlating the scores used
in screening with the criteria of teaching
success. By this means it will be possible to
determine how well the screening devices can
discriminate between teachers of superior,
average, and inferior teaching ability. The
information gained through this study should offer a
basis for improving the screening procedures in the
School of Education of the University of
Wisconsin, and also provide a means for continuous
evaluation of the program.
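
The correlation of a screening score with a criterion of teaching success can be illustrated with a short sketch. It assumes that a Pearson product-moment coefficient is the measure intended, which the text above does not specify, and the values for the five candidates are hypothetical, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical values for five candidates (assumed, not taken from the study).
earned_gpa = [1.4, 1.9, 2.3, 1.6, 2.7]         # screening score
in_service_rating = [2.0, 3.0, 4.0, 2.5, 4.5]  # criterion of success

r = pearson_r(earned_gpa, in_service_rating)
print(f"r = {r:.3f}")
```

A coefficient near zero would suggest that the screening score does little to discriminate between superior and inferior teachers on that criterion; a coefficient near 1 would suggest that it discriminates well.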


B. Selection of Candidates for Teacher Training 
at the University of Wisconsin 


The selection of candidates for teacher
training at the University of Wisconsin has a
dual purpose: (1) to assure that all individuals who
are accepted for training as teachers will
succeed, and (2) to assure that a larger proportion
of those accepted for training are capable of
becoming superior teachers. Thus screening
seeks to protect individuals from entering a field
of work in which they may not succeed, while at
the same time protecting our schools by
supplying better teachers.

A major point in the screening of candidates 
for teacher training at the University of Wiscon- 
sin occurs at the time of admission. The data 
on which admission is based includes personal 
data (physical characteristics, appearance, in- 
terests, ambitions), family data (nationality, 
parent's occupation, residence, siblings), edu- 
cational background (academic record, test rec- 
ord, pattern of credits earned, personality rat- 
ing, extra-curricular activities) and a statement 
by the administrator of the preparatory school 
regarding the educational promise of the
individual.

The data is evaluated by an official in the 
Admissions Office at the University of Wiscon- 
sin. Greatest emphasis in the evaluation is 
placed upon the future academic promise of the 
applicant. Upon admission each student is as- 
signed to the school of his choice and an advisor 
in his major field. A student who expresses a 
preference for entering the School of Education 
is enrolled as ‘‘Pre Ed’’, and assigned to an ad- 
visor in the College of Letters and Science. 

During the week of registration new students
participate in a program of orientation to life at
the University. An important feature of this
program is the extensive testing done during the
period. The tests included in the program are the
Cooperative Reading Test, the Cooperative
General Culture Test, and the American Council on
Education Psychological Examination. The
results derived from these tests are used to
advise and counsel the student during his first two
years at the University. These test results
also have an important function in the screening
of teacher candidates at the time of admission
to professional study.


120 JOURNAL OF EXPERIMENTAL EDUCATION

Following admission to the University, there
is no direct screening of teacher candidates until
the student applies for transfer to the School of
Education at the end of the fourth semester of
study. Two basic requirements must be met
during the first four semesters to be admitted to
professional study in the School of Education: (1) a
student must have earned at least 62 credits of
an approved course of study with a minimum 1.3
grade point average; and (2) the course of study
a student presents for evaluation at the end of
four semesters’ work must meet the standard
requirements for majors and minors, specific
course requirements, and requirements varying
according to the major and minor departments.

At the end of the fourth semester of study 
(or when 62 credits in an approved pattern have 
been earned) the student may apply for transfer 
to the School of Education for professional train- 
ing. Evaluation of a student’s record up to that 
point constitutes a second major point in the 
screening process. Data on which the screening 
is based includes a transcript of credits earned, 
grade point average, high school rank, and the 
results from the orientation tests taken during 
the registration period at the beginning of the 
first semester at the University. 

The most important factors in the screening 
are the two basic requirements for admission to 
the School of Education—completion of course 
requirements and maintenance of a 1.3 grade 
point average. 

Course requirements which must be completed
before an applicant may be admitted to
professional study include:


a. English attainment requirements 

b. Physical Education or Military Science 

c. Minimum requirements in majors and 
minors 

d. A minimum of 62 credits 


In addition each major department has vary- 
ing requirements which the individual must meet. 

In some cases when most requirements have 
been met and the candidate presents records 
otherwise suitable, he may be admitted on the 
condition that certain deficiencies will be removed 
during the following semester. In other cases 
where many requirements remain to be complet- 
ed, the candidate must utilize an additional sem- 
ester or summer session before application for 
transfer may be made. 

The record of credits earned is also
evaluated for grade point average (based on 1 grade point
per credit for a final course grade of ‘‘C’’, 2
grade points per credit for a final course grade
of ‘‘B’’, and 3 grade points per credit for a final
grade of ‘‘A’’). A minimum total grade point
average of 1.3 is specified for admission to the
School of Education.
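
The grade point computation just described, together with the 62-credit and 1.3 minimums, can be sketched as follows. This is an illustration only: it assumes that grades below ‘‘C’’ earn no grade points and that every course listed counts toward credits earned, neither of which is stated in the text, and the function names are hypothetical.

```python
# Grade points per credit as stated in the text: C = 1, B = 2, A = 3.
GRADE_POINTS = {"A": 3, "B": 2, "C": 1}
MIN_GPA = 1.3      # minimum grade point average for admission
MIN_CREDITS = 62   # minimum credits in an approved course of study


def grade_point_average(courses):
    """Return grade points per credit for a list of (credits, grade) pairs.

    Assumption: any grade not listed in GRADE_POINTS earns 0 points.
    """
    total_credits = sum(credits for credits, _ in courses)
    if total_credits == 0:
        raise ValueError("no credits earned")
    total_points = sum(
        credits * GRADE_POINTS.get(grade, 0) for credits, grade in courses
    )
    return total_points / total_credits


def meets_transfer_minimum(courses):
    """Check only the two numerical minimums (credits and average)."""
    total_credits = sum(credits for credits, _ in courses)
    if total_credits < MIN_CREDITS:
        return False
    return grade_point_average(courses) >= MIN_GPA


# Hypothetical record: 3 credits of A, 3 of B, 4 of C.
sample = [(3, "A"), (3, "B"), (4, "C")]
print(grade_point_average(sample))  # (9 + 6 + 4) / 10
```

The check covers only the numerical minimums; the course-pattern and departmental requirements described above would need separate handling.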

A candidate whose application for admission
to teacher training is rejected on the basis of a
grade point average too low to meet the minimum
requirement may request to have his case
reviewed by the Dean of the School of Education, or
an assistant. In such a case, compensating
factors such as an above-average I.Q., or
above-average high school rank, are sought in the
candidate’s records. Such candidates whose
records are otherwise satisfactory may be admitted
on a strict probationary basis.

While data from the Cooperative General
Culture and Cooperative Reading tests are used as a
part of the screening process, they do not play a
part in a candidate’s admission to the School of
Education. These data are used to aid the indi- 
vidual candidate and his advisor in plotting the 
most appropriate course of professional study 
based on his skills and interests. 

Following admission to professional study 
the student remains subject to course require- 
ments while maintaining the 1.3 grade point av- 
erage. Both of these devices continue to serve 
the screening function in that they eliminate those 
who cannot reach the minimum standards of suc- 
cess in teacher training. 

During training the candidates must meet 
three other screening situations to qualify for 
graduation and certification as a teacher. The 
first of these is a speech test which is adminis- 
tered jointly by the School of Education and the 
Department of Speech. Its purpose is to certify 
that the speech proficiency of the teacher candi- 
date is of a satisfactory standard for classroom 
work. Provision is made for remedial work for 
those who cannot qualify on the initial test. Occasionally
this device may screen out individuals
whose speech handicaps would limit
their efficiency in the classroom.

The Minnesota Multiphasic Personality Inventory
serves as a second screening device during
the period of professional training of teachers.
Use of the inventory is limited to the detection
of individuals whose personality is unstable
to the point of limiting their effectiveness
in the classroom. Such individuals are referred
to the Student Health Clinic for treatment and are
counseled into other fields of work.

Finally, a candidate must present a certifi- 
cate of physical health and fitness from the Uni- 
versity Medical Examiner as an indication that 
no physical defects exist to limit the individual’s 
success as a teacher. 

When the candidate has successfully met each
of these screenings the School of Education is
willing to certify his success as a teacher by
granting the University Teacher's Certificate.
Through the use of the screening devices as described,
only those whose success as teachers is
reasonably assured are retained for training and
graduated with certification.


SECTION III 
GATHERING THE DATA 
A. The Study Group 


AS A BASIS for study of the selection pro- 
cedures employed by the School of Education of 
the University of Wisconsin, the 1952 graduating 
classes (February, June and August) were chosen 
for study. Members of these classes have been 
teaching one or more years, thus giving an ade- 
quate basis for in-service success rating. 

The combined membership of these three
classes is 352, of whom 134 were men and 218 were
women. A preliminary survey of the group made in
October, 1953, disclosed that 54 men and 133
women, or a total of 187, taught during the first
year following graduation; a total of 165 did not
teach, of whom 80 were men and 85 women. Of the 80
men who did not teach, 30 were in military service;
of the 85 women who did not teach, 27 were
married.

The remaining non-teaching graduates may 
be grouped as follows: 


1. Attended graduate school—19 men, 9 
women 

2. Decided not to teach—12 women 

3. No record of employment, and no reply 
to two inquiries—13 men, 9 women 

4. Entered private industry—11 men, 12 
women 

5. Other public employment—4 men, 4 
women 

6. Unplaced—3 men, 9 women 


While this non-teaching group appears large,
it is possible that many may eventually become
teachers. Some of those now in military service and
others in the Graduate School will doubtless enter
the profession later. Nevertheless, considering
the totals involved, the non-teaching group
remains large.

Further study of the 1952 graduating group 
made it apparent that much of the data used for 
screening was not available for those who trans- 
ferred to the University of Wisconsin after a 
year or two of study elsewhere. These were not 
processed by the usual admission procedures, 
nor were the data of the orientation testing pro- 
gram available. Therefore, only those who had 
originally entered the University as freshmen 
and who had gone through the entire procedure of 
admission and screening were included in the 
study. Thus, 163 transfers to the University of
Wisconsin were dropped from the study for lack
of data, leaving 189 in the group to be studied.

The placement records for this group provide
the data presented in Tables I, II, and III.


B. Methods Employed in Gathering the Data 


To facilitate gathering of the data, a special
4" x 6" card was devised and printed for use in
the study. One card was prepared for each individual.
On the top line of the card, beginning at
the left margin, the name of the individual was
typed. The space immediately below the name
was reserved for the date of entry into the University
and a notation whether the individual was
an original entry or a transfer student. The upper
right hand corner was used to record the date
of graduation and the individual's academic major
and minors. The space below Criterion Rank
was used to record the details of the individual's
placement.

Since all test data were filed according to
date of entry, the transcript of each student's
record was the logical starting point. Transcripts
of the graduates were made available by
the School of Education Dean's office. In addition
to the date of entry the transcript also contained
high school rank and earned grade point average
data. Since rank in high school class had already
been converted to a percentile score, these data
were simply transferred to the record card. To
compute the earned grade point average it was
necessary to count the number of credits and
grade points earned and to record them in fraction
form to be calculated later.

With the date of entry available, the gathering
of test data could go ahead, since these data
were filed according to the student's entry date.
The test data included:


1. Henmon-Nelson Psychological 

2. American Council on Education Psycho- 
logical 

3. Cooperative Reading 

4. Cooperative General Culture 


The information on these tests was made available
through the Student Counseling Center.
Since the data for each of the tests were already
in percentile rank form, the data were transferred
directly to the individual record for each
graduate in the study group. The Student Counseling
Center also furnished data on each individual's
predicted grade point average (based on a
regression equation using high school rank and
percentile rank from the American Council on
Education Psychological examination to predict
Grade Point Average).

Inasmuch as only the raw scores were available
for the Minnesota Multiphasic Personality
Inventory, it was necessary to complete a profile
and code for each member of the study group
before the individual's score could be recorded.


TABLE I


SUMMARY OF PLACEMENT OF THE 1952 STUDY GROUP OF THE
SCHOOL OF EDUCATION AT THE
UNIVERSITY OF WISCONSIN
(Survey of October, 1952)


Men employed in teaching positions             21
Women employed in teaching positions           77
  Total teaching                               98
Men employed in non-teaching positions         42
Women employed in non-teaching positions       49
  Total non-teaching                           91
Group Total                                   189

TABLE II


SUMMARY OF PLACEMENT OF THE 1952 STUDY GROUP OF THE 
SCHOOL OF EDUCATION AT THE 
UNIVERSITY OF WISCONSIN 
IN TEACHING POSITIONS 
(Survey of October, 1952) 


Teaching Field Men Women Total 
Agriculture 2 2 
Art Education 2 5 7 
Business Education 1 1 
Chemistry 1 1 2 
Economics 1 1 
English 1 10 11
French 2 2 
Geography 1 1 
History 3 1 4 
Home Economics 26 26 
Mathematics 1 1 
Music 9 9 
Natural Science 2 1 3 
Physical Education 7 5 12 
Recreation 7 7 
Sociology 1 1 
Speech 1 1 
Speech Correction 1 6 7 
Total Men Teaching 21
Total Women Teaching 77
Total Graduates Teaching 98


TABLE III


SUMMARY OF PLACEMENT OF THE 1952 STUDY GROUP OF THE 
SCHOOL OF EDUCATION AT THE 
UNIVERSITY OF WISCONSIN 
IN POSITIONS OTHER THAN TEACHING 
(Survey of October, 1952) 


Men Women Total 


Decided Not to Teach 

Graduate School (U of W) 
(U of Chicago) 

Married 

Military Service 

No Reply 

Other Public Employment 

Private Industry 

Unplaced 




Total Men Not Teaching 42 
Total Women Not Teaching 49 
Total Graduates Not Teaching 91 


TABLE IV


CORRELATIONS OF FOUR CRITERIA OF TEACHING SUCCESS WITH TEST DATA
EMPLOYED IN SCREENING CANDIDATES FOR TEACHER TRAINING


Criteria of Teaching Success, in order: In-Service Rating, Departmental
Rating, Placement Bureau Rating, Practice Teaching Grades

Screening Data

Henmon-Nelson
  Psychological            .216   .004
ACE Psychological          -.027   .163   .073   .026
Reading                    .056   .240   .106   .059
General Coop. Culture
  Social Problems          -.169   .105   -.060
  History                  -.245   .101   .054   -.038
  Literature               -.549   .087   -.112   -.039
  Science                  -.176   .045   .000   -.061
  Fine Arts                -.042   -.161   -.133   -.194


The data on the speech screening test were
made available in the office of Professor Gladys
Borchers, Chairman of the Education-Speech
Committee. Ratings in their original form were:
A - superior, B - above average, and C - average
(no student is certified with less than a "C"
rating; a student below that level is assigned
remedial work until a "C" rating is earned). To give
these ratings a numerical basis for statistical
purposes, an "A" was recorded as "5", "B" as "4",
and "C" as "3".

With the exception of the data from the Minnesota
Multiphasic Personality Inventory, all the
data were in a form readily adaptable to use in a
correlational study. The MMPI data were not
amenable to such a study.

During the time screening data were being
gathered, the data on the criteria of teaching
success, to which the screening data were to be
related, were also being recorded. The criteria
of teaching success include:


1. An in-service rating 

2. A departmental rating 

3. A Placement Bureau rating 
4. Practice teaching grades 


To obtain the in-service rating for the 98 
graduates of the study group who were employed 
as teachers during the first year following grad- 
uation, a postal reply rating card was devised 
and printed (see Appendix F).* The rating is 
based on the individual's performance in his first 
year in teaching. 

These cards were mailed to the superintendents
or principals of 95 teachers in the group.
Ratings for three of the teaching group who were
employed as teachers of recreation by the American
Red Cross were not requested because their
assignments had shifted frequently and no current
information on their present situation was available;
furthermore, these individuals were not assigned
in one location long enough to permit an accurate
in-service rating. Wherever possible, the request
for a rating was sent directly to the superintendent
or principal in charge.

Within 14 days of the mailing date, 76 (80%) 
had been returned. At the end of 30 days, 88 
(93%) had been returned. Two of the remaining 
seven for whom no rating was returned had not
been placed as the survey of placement had indi- 
cated. The remaining five were placed out of the 
State of Wisconsin, and, lacking the name of their 
principal or superintendent, no further effort was 
made to obtain an in-service rating. 
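
The return percentages quoted above follow directly from the counts given. The short check below is offered only as an arithmetic illustration; the function name is hypothetical.

```python
def return_rate(returned, mailed):
    """Percentage of rating cards returned, rounded to a whole per cent."""
    if mailed <= 0:
        raise ValueError("at least one card must have been mailed")
    return round(100 * returned / mailed)


teaching_group = 98          # graduates employed as teachers
not_requested = 3            # Red Cross recreation teachers
mailed = teaching_group - not_requested   # 95 cards mailed

print(return_rate(76, mailed))   # cards back within 14 days
print(return_rate(88, mailed))   # cards back within 30 days
print(mailed - 88)               # teachers for whom no rating was returned
```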

The in-service ratings obtained through this
means were recorded as "5" for a superior rating,
"4" for an above-average rating, "3" for
an average rating, "2" for a below-average rating,
and "1" for an inferior rating. By giving
these ratings a numerical value it was possible
to make various statistical analyses of them.

A second criterion of teaching success consists
of a departmental rating. This rating is
made by the faculty of each individual’s major 
department. The department’s rating is an esti- 
mate of the individual’s potentialities as a teach- 
er. Since a major department’s most important 
contact with the individual is through his class- 
work, the rating may reflect heavily the individ- 
ual’s academic achievement in his major sub- 
jects. 

The departmental ratings for the 1952 grad- 
uating classes were not uniform in the type of 
ratings employed. To produce as much uniform- 
ity between the departmental ratings as possible 
each department prepared a key for translating 
the scores into superior, above average, aver- 
age, below average, inferior ratings. These, as 
with the in-service ratings, were recorded as 
numerical quantities ("5" for a superior rating,
"4" for an above average rating, etc.) for direct
use in the computations.

A Placement Bureau rating was the third 
criterion of teaching success. This rating, made 
by the Assistant Director of the Placement Bur- 
eau, depends on a group of factors not likely to 
appear in the other criteria ratings. The following
factors were said to be involved in arriving
at a rating:


1. Credentials: statements of observing officials,
advisors, teachers, and supervisors regarding
the individual's promise. Other information
used here includes statements by the candidate
himself regarding his interests, preferences,
and ambitions.

2. Observations: appearance, attitudes and general
adjustment of the individual are observed
in a personal conference, in connection with
his routine duties, and in social situations.

3. Reviews of practice teaching performance by
the critic teachers.

4. Transcript is consulted for placement purposes
only; it is not used for rating purposes.
Grade point average is used for rating purposes
only when very high or very low.

5. The departmental rating is considered only
when very high or very low.


Ratings were provided on the superior, 
above average, average, below average, infer- 
ior scale. The ratings were recorded on the 
same numerical basis as the other ratings. 

Only 141 ratings could be provided by the
Placement Bureau since 48 in the study group
did not register with the Bureau. No special attempt
was made to determine why these 48 did
not register, but a simple survey of the records
disclosed that a large proportion were those who
decided not to teach, those women who were
married and decided not to seek placement, and
the graduates who majored in Recreation and
were placed through other placement facilities.


* All references to Appendices may be found in the original thesis filed in the Library, University of
Wisconsin, Madison, Wisconsin.

A final criterion of teaching success con- 
sists of practice teaching grades. These grades 
are based on each individual’s attainment in two 
practice teaching situations—one semester of 
practice teaching in a minor academic field and 
one semester of practice teaching in the major 
academic field. These grades do not appear 
separately on the transcript but are available sep- 
arately in the Student Teaching records office. 
With separate major and minor practice teach- 
ing grades available for each individual, the 
grades were averaged to produce a single prac- 
tice teaching grade. 

Practice teaching grades are generally 
awarded on a superior, average, inferior basis 
using an A, B, C grading system. It was neces- 
sary, therefore, to assign the numerical values 
to these grades as follows: 


A (superior)                 5
A-, B+ (above average)       4
B (average)                  3
B-, C+ (below average)       2
C or below (inferior)        1


With these numerical values the ratings will be 
used in the computations in the same way as the 
other criteria. 
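
The conversion of practice teaching grades might be sketched as follows. Two points are assumptions rather than statements of the procedure actually followed: the values run from 5 (superior) to 1 (inferior) as on the other rating scales, and the major and minor grades are averaged after conversion to numbers rather than before.

```python
# Assumed numerical values, following the 5-to-1 scale of the other criteria.
PRACTICE_GRADE_VALUES = {
    "A": 5,              # superior
    "A-": 4, "B+": 4,    # above average
    "B": 3,              # average
    "B-": 2, "C+": 2,    # below average
    "C": 1,              # inferior
}


def practice_grade_value(letter):
    """Convert a letter grade to its rating value.

    Any grade not listed is treated as falling below C, hence inferior (1).
    """
    return PRACTICE_GRADE_VALUES.get(letter.strip().upper(), 1)


def practice_teaching_score(major_grade, minor_grade):
    """Average the major-field and minor-field practice teaching values."""
    return (practice_grade_value(major_grade) + practice_grade_value(minor_grade)) / 2


# Hypothetical candidate with an A in the major field and a B+ in the minor.
print(practice_teaching_score("A", "B+"))
```

Averaging after conversion keeps the practice teaching score on the same 5-point footing as the in-service, departmental, and Placement Bureau ratings.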

A single, over-all criterion of teaching success
was derived from an average of the four
criteria described above. No weighting was given
the separate criteria, for two reasons: (1) since
one individual was responsible for each of the
ratings, it is felt that the judgment of any
individual should not be emphasized more than the
others; (2) with a straight average to produce the
criterion, no one factor in teaching success is
emphasized. It is felt that all the factors involved
in arriving at the criteria ratings are contributory
to teaching success, and should be considered equally.

In computing the criterion, 80 of the total 
study group had all four of the criteria available. 
An additional 66 of the total group had three cri- 
teria available to formulate their criterion. For 
the remaining 43 of the total group only two cri- 
teria operated in arriving at their criterion of 
teaching success. 
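
A minimal sketch of the straight average follows, assuming that a missing rating is simply left out of the average and that at least two ratings must be present, as was true of every member of the study group. The field names and the sample record are hypothetical.

```python
def composite_criterion(ratings):
    """Unweighted mean of whichever criteria ratings are available.

    ratings maps a criterion name to a value on the 5-point scale,
    or to None when that rating could not be obtained.
    """
    available = [value for value in ratings.values() if value is not None]
    if len(available) < 2:
        raise ValueError("at least two criteria ratings are required")
    return sum(available) / len(available)


# Hypothetical graduate who did not register with the Placement Bureau.
graduate = {
    "in_service": 4,
    "departmental": 3,
    "placement_bureau": None,
    "practice_teaching": 4.5,
}
print(round(composite_criterion(graduate), 2))

# The counts reported above account for the whole study group.
print(80 + 66 + 43)
```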

To avoid a marginal criterion of teaching
success rating for seven individuals, it was
necessary to give emphasis to a single rating.
Wherever this occurred, emphasis was given in
the direction of the in-service rating, if available;
to the Placement Bureau rating if the in-service
rating was missing; or to the practice
teaching grades if both the in-service ratings and
the Placement Bureau ratings were not available.

These criteria of teaching success were 
considered separately in correlation studies with 
the screening data, and then, combined into a 
criterion of teaching success, were correlated 
again with the screening data. 


SECTION IV 
ANALYSIS OF THE DATA 
A. The Criteria of Teaching Success 


TO GET further data relative to the criter- 
ia of teaching success, intercorrelations were 
calculated among them. 

The correlation between the in-service rat- 
ings and the departmental ratings, based on 88 
cases for whom in-service ratings were avail- 
able, was .319. 
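
The intercorrelations reported in this section rest on differing numbers of cases because not every rating was available for every graduate. A minimal sketch of a product-moment correlation computed only over the cases having both ratings is given here; the use of pairwise-complete cases is an assumption about the procedure, and the ratings shown are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation over cases where both values are present.

    Returns (r, n), where n is the number of complete pairs used.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two complete pairs are required")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a rating does not vary")
    return sxy / sqrt(sxx * syy), n


# Hypothetical ratings; None marks a rating that could not be obtained.
in_service = [5, 4, None, 3, 2, 4]
departmental = [4, 4, 3, 3, 1, None]
r, cases = pearson_r(in_service, departmental)
print(round(r, 3), cases)
```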

It is entirely probable that these ratings have
only academic ability as a common element. The
department faculties were decidedly limited in
the aspects of teaching upon which their estimates
could be based. Academic ability was
the one aspect with which these individuals were
most familiar. The in-service ratings depended
upon this and other qualities as well.

While there is little relationship between 
these data it is felt that both areas covered by 
the ratings are of importance in the training of 
a teacher as well as in success in teaching. 

The highest correlation between any two criteria
of teaching success was .627 for 64 cases
based on the in-service and Placement Bureau
ratings. A strong similarity in what the ratings
attempt to measure doubtless accounts for the
relatively high correlation. In both ratings the
academic record is consulted, but not emphasized.
Furthermore, in both the ratings, personality
becomes a matter of considerable importance.
Such matters as adjustment in social
situations, interest in people, attitudes toward
community responsibilities, and general personal
appearance are important factors in both the
in-service and the Placement Bureau ratings.

The relationship between the in-service ratings
and practice teaching grades, based on 68
cases, was .327, which is low. It is interesting
that there should be so little in common in the
two ratings. It is possible that the ability to
organize and discharge the set of duties comprising
an actual teaching position is different
from the ability called for in practice teaching.
It would seem, on the basis of the low correlation
here derived, that a study should be made of the
factors producing such a result.

The correlation involving 141 cases between
the departmental ratings and Placement Bureau
ratings was .472.

While both ratings make use of academic 
achievement as part of the rating, the Place- 
ment Bureau would appear to have recognized 
the importance of the individual's personality in 
teaching. 

Further differences in the ratings emphasize
the supplementary character of the ratings.
An individual's ability to adapt himself to a specific
job, to a school organization, and to a community
is of much importance in the Placement
Bureau rating, while the departmental rating is
not so much concerned with this factor. A third
important difference in the ratings concerns the
type of performance each is concerned with;
the Placement Bureau rating is alert to performance
in leadership and organization of social
services while the departmental rating depends
largely on academic performance.

A correlation of .551 involving 189 cases 
indicates considerable similarity between the 
departmental rating and practice teaching grades. 
It appears likely that both of these ratings de- 
pend heavily on academic achievement. 

It appears that the relatively high correlation
of departmental-practice teaching ratings
may offer some clue to the inability of practice
teaching grades to predict in-service success.
Greater similarities occur between practice
teaching grades and departmental ratings than
between practice teaching grades and in-service
ratings. It is likely, then, that greater emphasis
in practice teaching grades is being placed on the
academic aspects of teacher preparation than on
the leadership and organizational factors considered
necessary to succeed on the job.

The Placement Bureau rating serves as a
transitional rating between training for teaching
and actual teaching in the field. This rating correlates
well with measures of in-service success,
emphasizing the practical aspects of teaching,
and also correlates well with measures of
teacher success taken during preparation for
teaching, emphasizing the theoretical and academic
aspects of teaching. A measure of teaching
success involving the factors used in the Placement
Bureau rating warrants further investigation
for possible adaptation to pre-training selection
purposes.

The value of the departmental ratings and
practice teaching grades seems to be low for
purposes of predicting in-service success.
These ratings probably emphasize the academic
and theoretical factors in teacher training as a
basis for measuring teacher success. As a result
of this emphasis, however, they reflect
teacher success in training, which justifies their
use as criteria of success.

Since all of the elements used to arrive at
the criteria ratings are of importance in some
phase of teacher preparation and teaching, and
teacher selection needs to be concerned with all
these aspects, a straight average of the criteria
is used to produce a composite criterion of
teacher success. It is assumed that this criterion
of teacher success will reflect all these important
elements in weighing the ability of a
screening device to discriminate between levels
of teaching ability.

In an effort to get further data on the inter- 
relationships of the criteria of teacher success 
a multiple R was calculated using the depart- 
mental rating, Placement Bureau rating, and 
practice teaching grades to predict in-service 
success. The R was .629. Comparison of this 
figure with the intercorrelation figures on the 
teaching success criteria will show that the three 
ratings used together to predict in-service suc- 
cess are no better than the rating used by the 
Placement Bureau alone. It further indicates 
that what is being measured in the departmental 
ratings and practice teaching grades has no par- 
ticularly significant relationship to in-service 
teaching success. 
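
The point that the combined ratings predict in-service success no better than the Placement Bureau rating alone can be illustrated with the standard two-predictor formula for a multiple correlation. This is not the three-predictor computation reported above, since the correlation between the Placement Bureau rating and practice teaching grades is not given in this section; moreover, the three coefficients used rest on different numbers of cases (88, 64, and 141), so the result is indicative only.

```python
from math import sqrt


def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple R of a criterion y on two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)


# Coefficients reported in the text.
r_inservice_departmental = 0.319
r_inservice_placement = 0.627
r_departmental_placement = 0.472

R = multiple_r_two_predictors(
    r_inservice_departmental, r_inservice_placement, r_departmental_placement
)
print(round(R, 3))
```

With these figures the two-predictor R barely exceeds the .627 obtained from the Placement Bureau rating alone, which agrees with the interpretation offered above.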


B. Correlations of Screening Data With the 
Criteria 


The data used for screening may conveniently
be divided into three groups: standardized
test data, academic achievement data, and
speech proficiency test data. Table IV shows
the correlations of the standardized test data
with the four criteria of teaching success. Table
V shows the correlations of the same test data
with the composite criterion of teaching success.

Examination of the correlations in Tables IV
and V, which relate the standardized test data
to the criteria of teaching success, discloses
that only two correlations in both tables
are very different from zero. The first of these,
namely, the correlation between the Henmon-Nelson
psychological scores and in-service success,
is quite low. The other correlation, the
-.549 between literature scores and in-service
success ratings, is negative and so would not
generally be acceptable as evidence of teacher
acceptability.

It must not be assumed that the evidence
given in Tables IV and V is proof that the areas
covered by the tests are not important in successful
teaching. Even the least well prepared may
appear adequate to rating officials, which would
produce little or no correlation. Then, too, these
particular instruments possibly cannot be relied
upon to screen teacher candidates. Other instruments
in the same areas may be able to perform the
screening function adequately where the instruments
under study here have failed.

Reference to Table VI will show that data
on academic achievement are promising for use
in screening. While the correlations are not
high, the data can be used with a reasonable degree
of validity.


TABLE V


CORRELATIONS OF A CRITERION OF TEACHING SUCCESS
WITH TEST DATA EMPLOYED IN SCREENING CANDIDATES
FOR TEACHER TRAINING


                                     Criterion of
Screening Data                       Teaching Success

Henmon-Nelson
  Psychological                      .139
ACE Psychological
Reading
General Cooperative Culture
  Social Problems
  History


TABLE VI


CORRELATIONS OF FOUR CRITERIA OF TEACHING SUCCESS WITH
ACADEMIC ACHIEVEMENT DATA EMPLOYED IN SCREENING
CANDIDATES FOR TEACHER TRAINING


                                                        Placement   Practice
                             In-Service  Departmental   Bureau      Teaching
Screening Data               Rating      Rating         Rating      Grades

High School Rank             .221        .205           .199        .237
Predicted Grade Point
  Average*                   .047        .309           .166        .115
Earned Grade Point
  Average (4 semesters)*     .385                       .302        .375


*Correlation between Predicted and Earned Grade Point Average = .570.

It will be noted that the correlations of High
School Rank with the various criteria of teaching
success are low; when High School Rank is
correlated with the composite criterion of teaching
success, the correlation becomes worthy of
consideration. It is doubtful, however, if the r
of .270 is high enough to justify the use of High
School Rank as a screening instrument. Possibly
its use may be justified if the severe limitations
imposed by the low validity are observed.

Predicted Grade Point Average does not appear
to qualify for use as a screening device.
Only the correlation with the departmental rating
is considerable. Its correlation with Earned
Grade Point Average, which is always a
consideration in the training of teachers, is .570.

The Earned Grade Point Average has correlations
with the criteria which are somewhat
higher and more consistent, and it is probably
more useful as a screening instrument.

Correlation with the criterion is higher 
than with each of the criteria separately, indi- 
cating that the Earned Grade Point Average can 
predict moderately well over a wide range of 
measures of teaching success. 

Since Earned Grade Point Average is the
one screening device capable of discriminating
between levels of teaching ability, its use might
possibly be broadened to include other devices.
Earned Grade Point Average is now used as a
basis for admission to professional study in
the School of Education, a 1.3 grade point average
being the minimum. This might well be carried
on through the final two years of preparation.
A separate requirement might be set up
to apply to professional courses. A minimum
required 1.8 grade point average, for example,
could be used to screen teacher candidates for
higher professional standards, while a 1.3 grade
point minimum could remain for all other courses.
Such adaptations could broaden the use of the
only valid screening device included in this study.
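
The two-level requirement suggested above might operate as in the following sketch. The 1.8 and 1.3 minimums are the illustrative figures given in the text; the function and argument names are hypothetical, and no such rule is reported as being in force.

```python
def passes_grade_screen(professional_gpa, other_gpa,
                        professional_minimum=1.8, other_minimum=1.3):
    """Apply separate grade point minimums to professional and other courses."""
    return professional_gpa >= professional_minimum and other_gpa >= other_minimum


# Hypothetical candidates: (professional course average, other course average).
candidates = {"first": (1.9, 1.4), "second": (1.7, 2.0), "third": (2.0, 1.2)}
for name, (professional, other) in candidates.items():
    print(name, passes_grade_screen(professional, other))
```

Under such a rule a candidate with a strong general record could still be held back by weak work in the professional courses, which is the tightening of standards the suggestion has in view.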

Reference to Table VIII will show
that the Speech Proficiency test is not capable
of predicting teacher success, there being a
near zero relationship between speech scores
and success in teaching. This certainly
does not mean, however, that the speech test no
longer serves an important function. The low
correlations probably arise out of the fact that
extreme cases have been removed from the
teacher preparation program or that the deficiency
has been overcome. Those that meet this
standard seem adequate.

The use of the speech proficiency test as a
screening device probably should be continued
to insure minimum speech proficiency. As such,
it will not be necessary to rate the individual,
but merely to certify that he meets minimum
standards, or to withhold certification until he
becomes qualified through remedial work.

In order to further describe the efficiency
with which the various selective devices operate,
the correlations between the selective devices
and the various criteria were converted into an
efficiency score through the use of the Predictive
Efficiency formula. In this formula, Pre.
Eff. = $1 - \sqrt{1 - r^2}$, where $\sqrt{1 - r^2}$ is the
Coefficient of Alienation. The coefficient gives a
basis for deciding how high a correlation must be
in order to be satisfactory for predictive purposes
(24:115). This subtracted from 1 gives a decimal
fraction which can be treated as a percentage.
A predictive efficiency percentage above 90 is
regarded as high, between 10 and 90 as moderate,
between 5 and 10 as low, and below 5 as
negligible.

The only correlations whose predictive efficiency
was better than 5% were between the
earned grade point average and the criteria of
teaching success. The correlation between
earned grade point average and the criterion of
teaching success yielded a 9% predictive efficiency;
between earned grade point average and
the in-service rating, 8% predictive efficiency;
between earned grade point average and the
departmental rating, 6% predictive efficiency; and
between earned grade point average and practice
teaching grades, 7% predictive efficiency. All
other predictive efficiency scores were less than
5%, thus not reliable for predictive purposes.
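
The conversion from a correlation to a predictive efficiency percentage can be checked with a short sketch, taking the formula as one minus the Coefficient of Alienation. The correlations used are those printed in Tables VI and VII for Earned Grade Point Average and High School Rank; the function name is hypothetical.

```python
from math import sqrt


def predictive_efficiency(r):
    """Return 1 - sqrt(1 - r^2), one minus the Coefficient of Alienation."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation must lie between -1 and 1")
    return 1 - sqrt(1 - r * r)


reported = {
    "earned GPA with composite criterion": 0.407,
    "earned GPA with in-service rating": 0.385,
    "earned GPA with practice teaching grades": 0.375,
    "high school rank with composite criterion": 0.270,
}
for label, r in reported.items():
    print(label, round(100 * predictive_efficiency(r)))
```

Because the efficiency rises so slowly with r, even the highest correlation in the study, .407, improves prediction by less than a tenth over chance.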


C. The Minnesota Multiphasic Personality In- 
ventory and Teaching Success 


The Minnesota Multiphasic Personality Inventory
was included as a part of the screening
program by the School of Education in order to
detect individuals with personalities such as to
limit their effectiveness in the classroom. The
data used in this study concern only those whose
code score met the standards considered to be
adequate for teaching. Accordingly, with the
elimination of the others, those remaining should
be adequate, as shown by subsequent results.

The data, when classified according to the 
categories of teaching success (namely, Super- 
ior, Above Average, Average, Below Average, 
and Inferior, based on the criterion of teaching 
success), yielded no discernible personality patterns.
Personality codes which appeared among
teachers judged to be inferior were found in 
equal or greater proportion among the other cat- 
egories of teaching success. Furthermore, per- 
sonality codes indicating a mild maladjustment 
appeared as frequently among the average, 
above average, and superior teachers as was 
the case among the below average or inferior. 

Further investigation in the use of this instrument
is possible, but beyond the scope of
the present investigation. The responses on the
test may be analyzed item by item to determine
what items, if any, are able to discriminate between
different levels of teacher success. Thus,
while the test as a whole is not a valid screening
instrument, separate items within the test may
be found entirely valid for use in screening. The
data collected on the candidates for teacher training
through the Minnesota Multiphasic Personality
Inventory indicate that this test is incapable
of predicting teacher success.


TABLE VII


CORRELATIONS OF A CRITERION OF TEACHING SUCCESS
WITH ACADEMIC ACHIEVEMENT DATA EMPLOYED IN
SCREENING CANDIDATES FOR TEACHER TRAINING


                                     Criterion of
Screening Data                       Teaching Success

High School Rank                     .270
Predicted Grade Point Average        .207
Earned Grade Point Average
  (4 Semesters)                      .407


TABLE VIII


CORRELATIONS OF A SPEECH PROFICIENCY TEST
USED IN SCREENING CANDIDATES FOR TEACHER
TRAINING WITH FOUR CRITERIA AND A
CRITERION OF TEACHING SUCCESS


                                     Criterion of
Screening Data                       Teaching Success

Speech                               .179
Speech - In-service Rating           .011
Speech - Departmental Rating         .253
Speech - Placement Bureau Rating     .069
Speech - Practice Teaching Grades
Speech - Composite Criterion of
  Teaching Success


D. Conclusions of the Study 


On the basis of the data described in the 
foregoing pages, answers are proposed to the 
basic questions involved in this study: 


1. How well do present selection pro- 
cedures discriminate between the 
superior teacher and the teacher 
likely to meet with only limited suc- 
cess? 


On the basis of present selection procedures, none of the
standardized tests used appears capable of predicting future
teacher success. These include the Henmon-Nelson psychological
test, the American Council on Education psychological test, the
Cooperative Reading test, and the Cooperative General Culture
test. Since the relation between scores earned on these tests
and eventual success in the profession is as low as it is, these
tests would appear to eliminate potentially successful teachers
as well as unsuccessful ones.

Academic achievement data hold some
promise for screening of teacher candidates, and 
the standards might be increased in this respect, 
but this will need to be done with care. Earned 
grade point average appears to be the most use- 
ful instrument in this group, and in the entire 
screening program for that matter, for predict- 
ing teacher success. As has been suggested 
earlier, the use of the overall grade point aver- 
age may be broadened to include other devices 
for screening, in addition to raising or lowering 
the minimum as the occasion demands. 

The use of High School Rank for screening purposes appears, as
far as the data here presented are concerned, to be of doubtful
value. Although the correlation of this device is larger than
most, it appears low for predictive purposes, particularly after
a preliminary selection has been made on the basis of grade point
average. Its use should probably be restricted to that of
providing supplementary data.

The use of the Speech Proficiency Test 
should be continued in the screening program, at 
least for certification. It is important that teach- 
er candidates be certified for minimum speech 
attainment necessary for classroom success.
It is probably not necessary to rate candidates
above the minimum requirements.


2. Under what circumstances do se- 
lection devices now employed per- 
mit admission of individuals not 
likely to succeed? 


Of the 189 total group studied, only 24 were 
judged on the basis of the criterion to have 
achieved less than average success. This group 
was obviously permitted to enter training and be- 
come teachers in spite of the screening proced- 
ures employed. Since no follow-up, other than 
the in-service rating, was conducted on the group 
it is not possible to determine why these 24 met 
with limited success. 

Of the total below-average group seven were 
admitted to professional study with earned grade 
point averages well below 1.3. An additional 12 
were admitted with grade point averages between 
1.30 and 1.50. These two sub-groups constitute 
79% of the total below-average group, indicating
that an academic basis exists for their low rat- 
ing. 

However, 4 in the below-average group had earned grade point
averages above 2.00. Thus, it appears that while the earned
grade point average is a valid measure of teaching success, it
is not sufficient in and of itself. Further experimental study
will be necessary to discover other valid factors in teaching
success to be combined with Earned Grade Point Average in an
improved screening program capable of isolating those
individuals not likely to succeed in teaching.


3. Is there basis for raising or lower- 
ing the standards by which candi- 
dates are admitted to pre-service 
training and certification as teach- 
ers? 


If the minimum grade point average were 
increased from the 1.30 now being employed to 
1.50, the higher minimum would screen out 13 
of the 24 who were judged to be of less than av- 
erage teaching ability. But at the same time, 
such an increase would eliminate 31 who were 
rated as average, 7 rated above average, and 1 
rated superior. Thus, it becomes evident that a
change of the grade point average minimum will
not alone be the solution to more adequate screening.
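
The trade-off described above can be made concrete with a little arithmetic. The following is a minimal sketch in Python that uses only the counts reported in this section (189 candidates, 24 of them rated below average; 13 below-average and 31 + 7 + 1 better-rated candidates falling under a 1.50 minimum). The variable names are illustrative and are not part of the study.

```python
# Counts taken from the text; all names are illustrative only.
total_group = 189
below_average = 24
satisfactory = total_group - below_average  # rated average or better

# Candidates who would fall under a 1.50 minimum, by eventual rating.
screened_out_below_average = 13
screened_out_satisfactory = 31 + 7 + 1  # average + above average + superior

# Share of the below-average group that the higher minimum would catch.
hit_rate = screened_out_below_average / below_average

# Share of the satisfactory group that would be rejected along with them.
false_rejection_rate = screened_out_satisfactory / satisfactory

# Of everyone rejected, the share who were in fact rated average or better.
share_of_rejections_wrong = screened_out_satisfactory / (
    screened_out_below_average + screened_out_satisfactory
)

print(f"below-average candidates screened out: {hit_rate:.0%}")
print(f"satisfactory candidates also rejected: {false_rejection_rate:.0%}")
print(f"rejections falling on satisfactory candidates: {share_of_rejections_wrong:.0%}")
```

On these counts, three of every four candidates rejected by the higher minimum would have been rated average or better, which is the sense in which a change of the minimum alone is not a solution.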


4. At what point in the teacher educa- 
tion program is the screening for 
teacher education likely to be most 
effective?


It is apparent that prediction of teacher success becomes easier
and more accurate as more information about the candidate becomes
available. The most accurate prediction can be made at the time
of graduation and certification, based on a 22% predictive
efficiency of the Placement Bureau ratings. But since it is
important for the efficient use of time and facilities to make a
prediction of success as early as possible, a balance must be
effected. Thus, the ideal time occurs when the decision to admit
or reject is made early enough to allow a rejectee ample time to
choose a new course without a great loss of credits, and late
enough to determine the earned grade point average on which a
reasonable judgment on future success in teaching may be based.
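
The text does not define "predictive efficiency." One conventional reading, assumed here and not confirmed by the source, is the index of forecasting efficiency, E = 1 - sqrt(1 - r^2), where r is the correlation between predictor and criterion. The sketch below applies that formula to the correlations reported in Table VII and inverts it to show what correlation a 22% efficiency would imply under the same assumption. The function names are hypothetical.

```python
import math


def forecasting_efficiency(r: float) -> float:
    """Index of forecasting efficiency, E = 1 - sqrt(1 - r^2).

    Assumption: this is the measure meant by "predictive efficiency";
    the source does not say so explicitly.
    """
    return 1.0 - math.sqrt(1.0 - r * r)


def correlation_for_efficiency(e: float) -> float:
    """Invert E = 1 - sqrt(1 - r^2) to recover the non-negative r."""
    return math.sqrt(1.0 - (1.0 - e) ** 2)


# Correlations with the criterion of teaching success, as reported
# in Table VII (academic achievement data).
table_vii = {
    "High School Rank": 0.270,
    "Predicted Grade Point Average": 0.207,
    "Earned Grade Point Average (4 Semesters)": 0.407,
}

for name, r in table_vii.items():
    print(f"{name}: r = {r:.3f}, E = {forecasting_efficiency(r):.1%}")

print(f"r implied by E = 22%: {correlation_for_efficiency(0.22):.2f}")
```

Under this reading even the best academic predictor improves on chance by less than a tenth, which is consistent with the caution urged in the conclusions.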

There is no decisive evidence on which se- 
lection of a point of most effective screening 
may be based. It is possible that the time of 
application for admission to professional study 
may be most effective, subject to further study. 

It must be pointed out here that no adequate screening program
will function properly with only one on-the-spot screening. Such
screening must be supplemented by continuous selection procedures
both before and after admission to professional study. These
would include active supervision of academic progress and the
course of study, periodic counseling, a speech proficiency test,
a personality test, a physical examination, an interview, and
standardized tests. A program such as this would be effective
because it allows time to gather adequate information about an
individual on which to base admission, while at the same time
providing for the increasing standards of attainment necessary
for well-qualified teachers.


5. What recommendations, based on 
the findings of the study, can be 
made for improved procedures for 
the selection of candidates for 
teaching?


a. It is entirely possible that standardized 
tests are now available which might be used in
the screening program to replace those not now 
competent for use in screening. The literature 
on screening of candidates for teacher training 
gives evidence of many standardized devices now 
in use, though there is no conclusive proof of 
their relation to teacher success. 

b. The use of the Grade Point Average may 
be broadened to include a specific minimum for 
professional courses. Since Grade Point Aver- 
age was demonstrated as an effective screening 
device, standards of more intensive preparation 
may be possible with a device such as this. 

c. Subjective techniques might be adapted for screening
purposes. Techniques such as the group interview, group dynamics
situations, and observation under social pressure are now in use
and under study by a number of teacher training institutions.
While their use seems promising, continuous research to check on
the results is necessary before they can be depended upon.

d. As an aid in the study of experience and personal
characteristics in a teacher candidate, a record system following
the individual through his four years of preparation for teaching
may be useful. Teacher training institutions now using this
device on an experimental basis find that observations going back
into secondary school make significant contributions to the
screening of successful teacher candidates.

e. Further research on the characteristics of the successful
teacher is needed, particularly on how personal cultural pattern,
philosophy, and system of values combine in a successful teacher.


DIFFERENTIAL METHODS OF SOLVING SELECTED
PROBLEMS ON THE ACE PSYCHOLOGICAL
EXAMINATION*


LEONE ANDERSON, RICHARD RANKIN, JOY RICHARDSON,
JULIUS SASSENRATH, JULIUS THOMAS
University of California at Berkeley


TWO EXPERIMENTERS using eye movement records have attempted to
uncover differential problem solving methods employed by high and
low performers. Anselmo (2), using Number Series problems, and
Greening (4), employing Figure Analogies problems, found that
high performers took less time than low performers in the
solution of the problems. Both experimenters concluded that the
average duration of fixation pauses was slightly less for high
performers, but the difference in methods of attacking the
problems remained undetected. In an effort to improve and expand
these earlier investigations, an attempt is made to analyze more
thoroughly the eye movement data, and also to obtain verbal
recordings of subjects during their solution of the problems.


Problem 


The experimenters seek to disclose the differential problem
solving processes employed by high and low performers,
respectively, in the solution of Number Series and Figure
Analogies problems from the ACE Psychological Examination (1).
The following specific questions were investigated:

a. Do high performers exhibit fewer fixations and regressions
than low performers?

b. Do high performers have a total duration of fixations and
regressions, respectively, which is less than that of the low
performers?

c. Do high performers fixate more on the pattern of the problem
than on the options; i.e., do they establish the pattern before
selecting an option as an answer, whereas low performers have an
equal number of fixations on the pattern and options of the
problem?

d. Do high performers on Number Series problems shift more
readily than low performers from one arithmetic process required
to solve a problem to another process for a subsequent problem;
i.e., shifting from addition to a combination of addition,
multiplication, and division?


* This experiment was conducted under the


Methodology 


The ACE Psychological Examination, in accordance with the
instructions (1), was administered to two classes in Introductory
Educational Psychology. From this population of 220 students,
those who missed fewer than six or more than 19 of the 30
problems on the ACE Number Series Sub-test were defined as the
high and low performers, respectively, on Number Series. Those
students who missed fewer than eight or more than 19 of the 30
problems on the ACE Figure Analogies Sub-test were defined as the
high and low performers, respectively, on Figure Analogies. This
method of selecting extreme performers ultimately provided the
following number of subjects in the four groups: low Number
Series (N = 5); high Number Series (N = 9); low Figure Analogies
(N = 6); and high Figure Analogies (N = 11).
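
The selection rule just described can be sketched in a few lines of Python. This is an illustration only: the cut-offs are those stated in the text, but the function names and the sample counts of missed problems are hypothetical and are not data from the experiment.

```python
from typing import Optional


def classify(missed: int, high_below: int, low_above: int = 19) -> Optional[str]:
    """Label a subject by the number of problems missed out of 30.

    Fewer than `high_below` missed -> "high" performer;
    more than `low_above` missed  -> "low" performer;
    anything in between is not an extreme performer (None).
    """
    if missed < high_below:
        return "high"
    if missed > low_above:
        return "low"
    return None


def number_series_group(missed: int) -> Optional[str]:
    # Number Series: fewer than six missed is "high".
    return classify(missed, high_below=6)


def figure_analogies_group(missed: int) -> Optional[str]:
    # Figure Analogies: fewer than eight missed is "high".
    return classify(missed, high_below=8)


# Hypothetical counts of missed problems, for illustration only.
for missed in (3, 7, 12, 20):
    print(missed, number_series_group(missed), figure_analogies_group(missed))
```

A subject who missed seven problems would thus be an extreme (high) performer on Figure Analogies but not on Number Series, which is why the four groups are defined separately for each sub-test.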

Eye movements were photographed by a cor- 
neal reflection type camera. The developed film 
was projected on a replication of the problems 
and the data were then tabulated. A detailed de- 
scription of the camera and procedure is given 
by Gilbert (3). 

A disc audograph** was used to record the 
subjects’ verbalizations of the step by step pro- 
gression of what they perceived and thought in 
their attempt to solve the problems. Data were 
compiled from the verbal recordings by develop- 
ing a rating scale with dichotomous ratings 
(+ or -). 

The following 11 items comprised the rating
scale used in evaluating verbal responses to the
Figure Analogies problems:

1. Identifies similarities only

2. Identifies differences only

3. Identifies similarities and differences

4. Uses mathematical concepts

5. Proceeds from an incomplete and/or inaccurate
recognition of constants (those aspects of the
figure which remain the same) and variables
(those aspects of the figure which change)

6. Develops an answer only through examination
of the rule; then proceeds to options
(Solution by Analysis)

7. Develops an answer by elimination of options
(Solution by Elimination)

8. Does not apply relationship once having
described it

9. Answers with correct solution, and (a) corrects
error, (b) does not correct error,
(c) makes no error

10. Answers with incorrect solution, although
(a) corrects error, (b) does not correct
error

11. Presents no solution


of Professor Luther C. Gilbert, in the Educational
Psychology Laboratory at the University of California, Berkeley.

** Gray Audograph, Hartford, Connecticut.


The following 13 items comprised the rating 
scale used in evaluating verbal responses to the 
Number Series problems: 

1. Identifies numbers only

2. Identifies arithmetic relationship only

3. Identifies numbers and arithmetic relationships

4. Proceeds by insightful technique; i.e.,
notes rules after verbalizing only one to
four numbers and/or relationships; then
answers

5. Proceeds by repetitious technique; i.e.,
(a) notes five to seven numbers and/or
relationships; then answers, (b) re-examines
the numbers and/or relationships already
verbalized; then answers

6. Develops an answer only through examination
of the pattern; then proceeds to options
(Solution by Analysis)

7. Develops an answer by elimination of options
(Solution by Elimination)

8. On first examination proceeds from an
incomplete and/or inaccurate recognition of
the rule

9. On first examination notes new arithmetic
relationships not present in preceding problems

10. Requires more than one examination to
note new arithmetic relationships not present
in preceding problems

11. Answers with correct solution and (a) corrects
error, (b) does not correct error,
(c) makes no error

12. Answers with incorrect solution, although
(a) corrects error, (b) does not correct error

13. Presents no solution


The problems to be used with the eye move- 
ment camera and verbal recorder were chosen 
from different editions of the ACE in order to 
minimize the effect of practice. The nine prob- 
lems to be solved before the camera were select- 
ed by an item-difficulty analysis of the Number 
Series and Figure Analogies Sub-tests of the 
1941 edition of the ACE. Assuming that the
corresponding problems in the 1940 edition of the ACE
had the same respective degrees of difficulty, problems
were selected from this edition for use with the 
verbal recorder. For purposes of analysis the 
individual problems were divided into two parts: 
(a) the pattern, i.e., the initial part of the prob- 
lem establishing the rule, and (b) the options, 
i.e., the multiple choice selection of answers. 
The problems, shown at top of next page, were 
chosen and classified as easy, medium, or diffi- 
cult. 

The laboratory procedure was introduced to each subject with a
brief explanation of the purpose of the experiment and the
principles of the camera and audograph. Each examinee was then
(a) adjusted before the camera, (b) given instructions adapted
from the ACE, (c) asked to solve two practice problems, and
(d) individually presented the nine problems as an untimed test.
Upon completing the camera procedure the subject was (a) seated
before the audograph, (b) given instructions adapted from the
ACE, (c) presented two practice problems to be solved verbally,
and (d) administered the problems as an untimed test, to which
the subject responded by verbalizing thought processes in
attempting a solution.


Discussion of Results 


From Table I it can be seen that the mean number of fixations
and regressions for Number Series problems is less for high
performers than for low performers. This appears to differ from
Anselmo's (2) findings, but may be accounted for by the more
stringent selection of subjects and problems for this
experiment. Mean total duration of fixations and regressions for
Number Series problems also indicates that the high performers
spend less time than low performers. However, the mean number of
correct answers for Number Series problems is not very different
for the low and high performers. Thus, without controlling the
time variable, the two groups performed equally well. This was
not true when these subjects were administered the ACE as a
timed test, from which they were selected as high and low
performers, respectively.

Different results emerge (Table I) for the two 
groups tested on Figure Analogies problems. Ex- 
cept for mean number of correct answers, there 
appears to be little or no difference between any 
of the measures. This lack of differences be- 
tween high and low performers is contrary to 
Greening's (4) conclusions. The apparent dif- 
ference between mean number of correct ans- 
wers indicates that even on an untimed basis the 
high performers are superior. Thus the Figure 
Analogies problems tended to function as a pow- 
er test, while this was not true of the Number 
Series problems. The design of the Number Series and Figure
Analogies problems does not, of course, permit a comparison of
the data on these two types of problems.

                                           Medium        Difficult
(Camera) Number Series (ACE 1941)          16, 17, 18    28, 29, 30
(Camera) Figure Analogies (ACE 1941)       16, 20, 21    28, 29, 30
(Audograph) Number Series (ACE 1940)       16, 17, 18    28, 29, 30
(Audograph) Figure Analogies (ACE 1940)    16, 20, 21    28, 29, 30

Figure 1 indicates that for the three levels of difficulty of
the Number Series and Figure Analogies problems, the high
performers fixated more on the pattern than did the low
performers. Moreover, both groups fixated more on the pattern
than on the options of all the problems, with the exception of
the low performers on Figure Analogies.

As the Number Series and Figure Analogies problems increased in
level of difficulty from easy to medium, both the high and low
performers spent a similarly greater percentage of fixations
establishing the pattern of the problem. Specifically, the
increment for Figure Analogies (ΔFA) is approximately the same
for the two groups, as is the increment for Number Series (ΔNS)
(Figure 1). However, with those problems which increased from a
medium to a difficult level, the low performers fixated less on
the pattern, thus relying more on an eliminative method of
selecting an answer from the options. In contrast, the high
performers in their solution of the difficult problems increased
the number of their fixations on the pattern, thus indicating a
more analytic method of problem solving. Here the differential
between the two groups' ΔNS values is less than that between
their ΔFA values.

Presumably, the effectiveness of the high performers' method of
problem solving is not appreciably reduced by the increasing
difficulty levels of those problems administered. Rather, it
demonstrates a proportionately greater intensity of analysis.
(Intensity is here taken to be a function of the percentage of
fixations on the pattern.)

Low performers, however, in exhibiting a maximum percentage of
fixations on the pattern for the medium level problems and a
regression to a lesser percentage for the difficult problems,
appear to execute a problem solving method characterized by a
simple integration of past experience and the immediate solution
of the problem. Their method seemingly cannot be intensified
beyond the medium level problems. This regressive nature of the
low performers' method in comparison with the progressive nature
of the high performers' method may be a significant differential
for discriminating problem solving attacks.

The percentages of fixations on the patterns and options were
further analyzed by computing the mean percentages of
consecutive fixations.


The first set of consecutive fixations on the pattern may
indicate the subjects' first attempt to establish the rule,
whereas the first set of consecutive fixations on the options
may indicate the subjects' first attempt to find a solution
among the alternatives. The second and third sets of consecutive
fixations may be second and third attempts to establish the rule
or a verification of the first answer. These data were compiled
in Table II.

On Number Series, for high and low performers in the first
examination, the mean percentage of consecutive fixations on the
patterns (P) was four to five times more than on the options (O).
In the second examination both high and low performers fixated
consecutively for an average percentage that was two to three
times more on the patterns than on the options, while in the
third examination both high and low performers fixated
consecutively for a more nearly equal mean percentage on
patterns and options. High performers in the first and second
examinations fixated consecutively on the patterns for about as
great a mean percentage of the total number of fixations as the
low performers. But in the third examination of the patterns the
low performers fixated consecutively for a mean percentage that
was three times greater than that of the high performers.
Similarly, in the first and second examinations of the options
the high performers fixated consecutively for about as great a
mean percentage as the low performers. Again, in the third
examination of the options, the low performers fixated
consecutively for a mean percentage of the total number of
fixations that was seven times larger than the high performers'
mean percentage. Therefore, in summation, the high performers
may be said to progress with a more thorough pattern analysis,
as indicated by the greater mean percentage of consecutive
fixations; i.e., the greater mean percentage of fixations on the
pattern implied a more complete observation. The low performers
appeared to be satisfied with a less complete analysis, and they
looked to the options for clues to the answer; i.e., looking for
a specific answer among the options would not require as many
fixations as attempting to eliminate four options. Since in the
third examination of the patterns the low performers fixated
consecutively for a mean percentage of the total fixations that
was three times larger than the high performers' percentage, two
different processes may have been occurring. The low performers
appeared to be still searching for the rule; the high performers
may have been verifying or correcting minor errors.


TABLE I

EYE MOVEMENT MEASURES FOR LOW AND HIGH PERFORMERS ON NUMBER
SERIES AND FIGURE ANALOGIES

Number Series     Figure Analogies
Low     High

Mean No. Fixations per Problem          21
Mean No. Regressions per Problem        9
Total Dur. Fixation (1/30 Sec.)         244
Total Dur. Regression (1/30 Sec.)       98
Mean No. Correct Answers                4.5


TABLE II

MEAN PERCENTAGE OF CONSECUTIVE FIXATIONS ON PATTERNS AND OPTIONS

Number Series     Figure Analogies
High     Low     High

Mean Percentage of
Consecutive Fixations     O%      P%      P%      O%

I Examination             24.4    21.1    39.9    25.5
II Examination            10.5    13.4    10.2    9.5
III Examination           7.0     7.4     5.4     3.1

On Figure Analogies, low performers fixated consecutively in all
three examinations on the patterns for the same mean percentage
as on the options. In contrast, the high performers scored a
greater mean percentage of consecutive fixations on the patterns
than on the options in the first examination, yet in the second
and third examinations they scored an almost equal mean
percentage on the patterns and on the options, therein
duplicating in part the results of the low performers. Similar
to the processes suggested by the Number Series data, the high
performers' first examination of the Figure Analogies may be
interpreted as a more thorough pattern analysis than the low
performers'. But there is a contrast of possible significance
between the distribution of time in the third examination for
the Figure Analogies and Number Series data; i.e., the
difference between the mean percentage of consecutive fixations
on the patterns of high and low performers was not as great as
the same differential in the Number Series data. Moreover, the
same differential in the options in the Figure Analogies data is
less than in the Number Series data.

In rating the verbal responses, the mean frequency of the
ascription of items to the four groups was computed.
Specifically, the mean frequencies zero to three, four to six,
and seven to ten inclusive represented infrequent, moderately
frequent, and very frequent ascriptions, respectively. From this
analysis those items rated with a mean frequency different (in
terms of the three categories above) for high and low performers
on Figure Analogies were found to be the following: (item 2)
identifying differences only, (item 3) identifying similarities
and differences, and (item 9c) answering with the correct
solution with no error.
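
The three-way grouping of mean frequencies can be sketched as follows. This is a minimal illustration in Python, not the authors' procedure: the function names are hypothetical, the example means are invented, and the rounding of fractional means to the nearest whole number (halves rounded up) is an assumption, since the text gives only the whole-number bands 0-3, 4-6, and 7-10.

```python
import math


def ascription_category(mean_frequency: float) -> str:
    """Map a mean frequency of ascription (0 to 10) to its band.

    Assumption: fractional means are rounded to the nearest whole
    number, halves up, before the bands 0-3, 4-6, 7-10 are applied.
    """
    if not 0 <= mean_frequency <= 10:
        raise ValueError("mean frequency must lie between 0 and 10")
    rounded = math.floor(mean_frequency + 0.5)
    if rounded <= 3:
        return "infrequent"
    if rounded <= 6:
        return "moderately frequent"
    return "very frequent"


def discriminates(high_mean: float, low_mean: float) -> bool:
    """An item discriminates if the two groups fall in different bands."""
    return ascription_category(high_mean) != ascription_category(low_mean)


# Invented example means, for illustration only.
print(ascription_category(2.0), ascription_category(5.0), ascription_category(8.0))
print(discriminates(high_mean=7.2, low_mean=4.1))
print(discriminates(high_mean=4.4, low_mean=5.9))
```

Because an item counts as discriminating only when the two groups land in different bands, sizable differences within a band go unrecorded, which may help explain why so few items were found to differ.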

Data for items two and three (Table II) strengthen the
evaluation (from percentage of fixation data in Figure 1) of the
high performers' method as analytic, wherein they tend to
examine only differences. Notation of similarities would be less
effective and more divertive. This indicates a greater
purposefulness in the high performers' method of problem solving
than in that of the low performers.

The finding that the item 9c, answering with 
the correct solution with no errors, discriminat- 
ed only between the low and high performers on 
Figure Analogies is corroborated by the eye 
movement data (Table I). From these, the dif- 
ferential in mean numbers of correct answers 
was found to be greater for the low and high per- 
formers on Figure Analogies than on Number 
Series. This indicates a power differential for
the low and high performers.

The finding that there were few rating items with a differential
mean frequency for low and high performers, as yielded by this
scale, may be attributable to the time factor in the
administration of the test from which the selection of the
subjects was made. For example, the effects of these variable
factors may have largely contributed to concealing the low
performers' solution as characteristically eliminative and that
of the high performers as characteristically analytic.


Summary and Conclusions


This experiment was conducted to disclose the differential
problem solving processes employed by high and low performers,
respectively, in the solution of Number Series and Figure
Analogies problems from the ACE Psychological Examination. A
sample of 31 college students from two classes in Introductory
Educational Psychology, who had missed fewer than six or more
than 19 Number Series items, or fewer than eight or more than 19
Figure Analogies items, was examined further on whichever of the
above mentioned sub-tests their performance was extreme. This
examination consisted of individual laboratory testing, first
before the eye movement camera, and second with the use of an
audograph for verbal recording. Eye movement measures were
compiled from the developed film, and the verbal recordings were
evaluated on a dichotomous rating scale.

Relevant to the questions investigated, the 
results suggest the following conclusions: 


1. High performers exhibit fewer fixations and regressions than
low performers in solving Number Series problems. High and low
performers employed a similar number of fixations and
regressions in Figure Analogies problems.

2. High performers have a total duration of 
fixations and regressions which is less than the 
low performers’ in solving Number Series prob- 
lems. For Figure Analogies problems the dura- 
tion of fixations and regressions is not very dif- 
ferent for the two groups. 

3. The high performers fixated more on the pattern than on the
options for all problems of the three levels of difficulty.
Moreover, their percentage of fixations on the pattern increased
with increasing difficulty level of the problems. For low
performers, the former finding was true only on the Number
Series problems. The percentage of their fixations on the
pattern increased only from the easy to the medium level
problems, then decreased for the difficult problems. This was
indicated in their performance on both Figure Analogies and
Number Series problems.

4. There was no substantiation in either the
eye movement or verbal-recorded data that
high performers shift more readily from one 
arithmetic process required to solve a problem 
to another process for a subsequent problem. 


The verification of the questions under investigation may have
been limited by the assumption that eye movement responses are
in part symptomatic of thought processes. This may not be true.
Rather, thought processes may be a "delayed-reaction-expression"
of the eye movement response (Thought 1 is concurrent in time
with Fixation 2); i.e., the eye movement versus the thought
response may be analogous to the eye movement versus the voice
span record.


ACADEMIC ATTRITION OF ENGINEERING
TRANSFER STUDENTS 


J. STANLEY AHMANN 
Cornell University 


PERIODICALLY some of the careful observ- 
ers of the higher education scene express in- 
creasing concern about the relatively high rates 
of academic attrition found at most institutions. 
In spite of any benefits which a student may receive
from a contact with a college or university,
be it as little as a single semester, the opinion 
is stated that the failure of sizable percentages 
of students to graduate has resulted in an unnec- 
essary dissipation of energies and finances. 
Were a rechanneling of these efforts possible, 
appreciable gains to both the institutions and the 
students concerned are envisioned. 

An answer to the problem would be, of course, a more careful
screening of applicants requesting entrance to engineering
curricula. Efforts to find those characteristics which are
highly related to academic success in engineering have been
numerous (8). The predictive usefulness of the high school
grade-point average (9), scholastic aptitude tests (2, 5, 11),
other aptitude tests (10), reading tests (3), interest tests (7),
and personality scales (6), individually and in combination, has
been investigated. In many instances the results have been
promising, even though incapable of offering a near perfect
selection scheme.

In the case of engineering colleges in which 
sizable numbers of students enter as transfer 
students, the problem of selecting the most prom- 
ising students is further complicated. At the 
Iowa State College, for example, estimates have 
been made that as many as 40 percent of the en- 
tering engineering students at the beginning of a 
fall term had received college credit from other 
institutions of higher education. A study (1) of 
the transfer students entering the Engineering 
Division of this college during the 1946-47,
1947-48, and 1948-49 academic years revealed that
most students (80%) had attended only one college
prior to enrolling at the Iowa State College, and
that the college was usually located in Iowa or
an adjacent state and enrolled fewer than 2500
students. Furthermore, of the 804 students
included in the study, only 246, or 31%, graduated
in engineering. The remaining 558 either
failed, transferred to non-engineering curricula
at the Iowa State College, transferred to other
institutions of higher education, or dropped from
college for miscellaneous personal reasons.
Even though a few of the 558 may have been aca- 
demically successful elsewhere, they can be 
properly classified as attrition students in the
eyes of the engineering faculty.

Although the foregoing study investigated the 
relationships between a series of numerical
variables and the tendency to graduate in engineering,
no attempt was made to study the possible
influence of non-numerical characteristics on 
this criterion. An extension of the study, there- 
fore, seemed in order. 

On the basis of a preliminary examination of 
the data available, one of the non-numerical
factors which seemed to warrant examination was
the type of institution first attended by the
transfer student. Although this factor was but one of
many potentially influential factors, indications
were found that it was possibly more influential 
than most. Therefore, the following report is 
restricted for the most part to the single consid- 
eration of whether the type of college at which 
a transfer student first matriculated affected his 
tendency to graduate in engineering at the Iowa
State College.

For purposes of classification, the engineering
transfer students were considered to have
matriculated for the first time at one of two
different types of institution: one offering
only a two-year program, or one offering more
than a two-year program. The question was
then posed whether, among transfer students
entering engineering curricula at the
Iowa State College, those who first matriculated
at institutions offering only a two-year program
differed from those who first matriculated at
institutions offering more than a two-year
program in tendency to graduate in engineering.

A random sample of 256 male engineering 
transfer students was selected from the 804 stu- 
dents included in the original study. This sample 
was so drawn that students having matriculated 
at both types of institutions were equally repre- 
sented. Furthermore, since earlier research (4) 
demonstrated that students who were veterans of 
World War II tended to surpass non-veteran stu- 
dents in academic achievement, the sample was 
further sub-divided on that basis, thus yielding 
four subgroups with 64 cases included in each 
subgroup. In Table I is shown the number of 
students in each subgroup who graduated in en- 
gineering. 

Inspection of this table revealed that, when 
individual differences in academic aptitude were 
ignored, sizable differences in tendency to
graduate existed. The students first matriculating
at an institution with more than a two-year
program seemed to graduate in distinctly greater
numbers than those who first matriculated at
institutions offering only a two-year program.
Also, veteran students obviously surpassed
non-veteran students with respect to this criterion.
Of the four subgroups, the veteran students first
matriculating at institutions offering more than
a two-year program seemed definitely to excel.

To test the significance of the differences in 
tendency to graduate in engineering, an analysis 
of variance can be computed provided an
assumption is made concerning the nature of the
graduation-attrition dichotomy. In this case, the
assumption was made that the tendency to graduate
in engineering was a single normally distributed 
variable and was no more sensitively measured 
than by the graduation-attrition classification. 
This assumption does, therefore, underlie all of 
the procedures and interpretations made in the 
following paragraphs. 

The steps followed in the computation of the
analysis of variance were the same as those
reported by Wert, Neidt, and Ahmann (12). According
to this procedure, sums of squares were
found for the main effects of type of college and
veteran status, as well as for the interaction, by
designating z/p as the value to be assigned to each
member of the graduation groups, and -z/q as the
value to be assigned to each member of the
attrition groups. The quantities p and q are the
proportions of the total sample of 256 students
who graduated and did not graduate in engineering,
respectively. The value z is the height of
the ordinate dividing the normal curve of unit
area into p and q parts. The entries in the analysis
of variance table were then found in somewhat
the same manner as in the problems in
which a numerical criterion is present. The
results are summarized in Table II.
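
The scoring step described above can be illustrated with a brief sketch. It assumes the values assigned to the two groups are z/p and -z/q, and it applies an ordinary balanced two-way analysis of variance to them; since the text says only that the entries were found in "somewhat the same manner" as with a numerical criterion, the details of the procedure in (12) may differ. The function names and the cell counts are hypothetical and are not the entries of Table I.

```python
from statistics import NormalDist


def dichotomy_scores(p):
    """Scores for the two parts of a dichotomized unit-normal variable.

    p is the proportion in the upper (graduation) group; q = 1 - p.
    Returns (z/p, -z/q), where z is the ordinate at the dividing point.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1.0 - p
    unit_normal = NormalDist()
    cut = unit_normal.inv_cdf(q)   # point with area q below and p above
    z = unit_normal.pdf(cut)       # height of the ordinate at that point
    return z / p, -z / q


def two_way_anova(graduates, n_per_cell):
    """Balanced 2 x 2 analysis of variance on the dichotomy scores.

    graduates[i][j] is the number graduating in the cell for college
    type i and veteran status j; every cell holds n_per_cell cases.
    Returns F-ratios, each with 1 and N - 4 degrees of freedom.
    """
    n = n_per_cell
    total_n = 4 * n
    total_grads = sum(sum(row) for row in graduates)
    grad_score, attr_score = dichotomy_scores(total_grads / total_n)

    # Cell totals of the assigned scores.
    cell = [[g * grad_score + (n - g) * attr_score for g in row]
            for row in graduates]
    grand = sum(sum(row) for row in cell)      # zero apart from rounding
    correction = grand ** 2 / total_n

    ss_total = (total_grads * grad_score ** 2
                + (total_n - total_grads) * attr_score ** 2) - correction
    ss_cells = sum(t ** 2 for row in cell for t in row) / n - correction
    row_totals = [sum(row) for row in cell]
    col_totals = [cell[0][j] + cell[1][j] for j in range(2)]
    ss_college = sum(t ** 2 for t in row_totals) / (2 * n) - correction
    ss_veteran = sum(t ** 2 for t in col_totals) / (2 * n) - correction
    ss_interaction = ss_cells - ss_college - ss_veteran
    ms_within = (ss_total - ss_cells) / (total_n - 4)

    return {
        "college": ss_college / ms_within,
        "veteran": ss_veteran / ms_within,
        "interaction": ss_interaction / ms_within,
    }


# Hypothetical counts, NOT the values of Table I.
# Rows: two-year, more than two-year; columns: veteran, non-veteran.
example_counts = [[22, 14], [28, 18]]
for effect, f in two_way_anova(example_counts, n_per_cell=64).items():
    print(f"{effect}: F = {f:.2f}")
```

With four subgroups of 64 cases, each F-ratio has 1 and 252 degrees of freedom, for which the 5% point is about 3.88. Because the two assigned values are a linear recoding of a 0/1 indicator, these F-ratios equal those obtained from the same analysis on the raw 0/1 classification.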

The F-value for the type of college main ef- 
fect failed to meet significance at the 5% level 
by a very slight amount. The conclusion, there- 
fore, was considered to be in doubt. The possi- 
bility remained that those transfer students who 
first matriculated at institutions offering only a 
two-year program did, as a group, experience 
greater difficulty in graduating because of that 
fact. In the case of the remaining two F-values, 
the significance of that for the veteran status 
main effect and the non-significance of the value 
for the interaction were not surprising. 

In the foregoing analysis any individual
differences in studentship which might have
influenced tendency to graduate in engineering on the
part of transfer students have been ignored. To
investigate the possible influence of type of
college on transfer students’ tendency to graduate
in engineering, an analysis corresponding closely
to the analysis of covariance was needed, in
which individual differences in studentship were
controlled.

The quantitative raw scores on the American 
Council on Education Psychological Examination 
and the high school grade-point averages were 
available for all students and were used as indi- 
cators of studentship. The latter values were 
tabulated on an A, B, C, D, and F basis, and 
then converted to a 4, 3, 2, 1, and 0 basis. The
mean values of both variables are shown in Table III.
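
The grade conversion just described may be sketched as follows. The sketch assumes an unweighted average over courses, which the text does not specify; the name `grade_point_average` is hypothetical.

```python
# Letter grades mapped to the 4, 3, 2, 1, 0 basis named in the text.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def grade_point_average(letter_grades):
    """Unweighted mean of the converted grades (an assumption)."""
    if not letter_grades:
        raise ValueError("at least one grade is required")
    points = [GRADE_POINTS[g.strip().upper()] for g in letter_grades]
    return sum(points) / len(points)


print(grade_point_average(["A", "B", "B", "C"]))
```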

In all four subgroups the difference between
the graduation group and the corresponding
attrition group was striking with respect to caliber
of studentship as represented by these two
variables. Differences between the means of the
quantitative scores were often as great as 10 points
and once almost 20 points. Differences between
the means of the high school grade-point
averages were usually 0.2 or 0.3. In every instance
the graduation group surpassed the attrition
group.

Of additional importance, even though not
included as such in Table III, was the fact that, as
a group, the transfer students representing the
one type of college differed from those
representing the other in the following manner. The
mean quantitative score and mean high school
grade-point average for the transfer students
who first matriculated at an institution with only a
two-year program were 61.6 and 2.62,
respectively. The corresponding values for the
transfer students who first matriculated at institutions
offering more than a two-year program were 64.4
and 2.68. In terms of these two variables,
therefore, the institutions offering the longer program
tended to attract the better students.

In order to control for the individual
differences in studentship as represented by these two
measures, the analysis of variance shown in
Table II was expanded into a variation of the
ordinary analysis of covariance. This variation,
although much the same as the original analysis
of covariance, employed modified discriminant
functions (12) in place of the regression
equations. The discriminant functions were of the
same number and type as the regression
equations used in covariance analysis and served
much the same function.

The results of the analysis are shown in Table 
IV. It should be noted that the proportions of the 
individual differences in graduation tendency 
that could be explained by variations in the quan- 
titative scores and the high school grade-point 
averages were computed, and were then
expressed as the proportion of the variance
representing individual differences in graduation
tendency not associated with variations in the
two numerical variables. The resulting values
were 0.8458, 0.8588, 0.8508, and 0.6510. With
these known proportions, it was possible to
return to the information assembled in the analysis
of variance shown in Table II, and remove,
as in the common analysis of covariance, any
allowance which need be made because of
individual differences between the groups on the
control factors. The adjusted sums of squares
were converted to mean squares and the F-values
computed in the usual manner.

The F-value for the type of college main ef- 
fect failed to reach the 5% level of significance. 
Therefore, insofar as the quantitative scores 
and high school grade-point averages controlled 
individual differences in studentship, and no 
other factors contributed a bias, no significant 
differences have been found in tendency to
graduate in engineering at the Iowa State College
between transfer students first matriculating at an
institution offering only a two-year program and
those transfer students first matriculating at
institutions offering more than a two-year
program. It was concluded that the possibility that
matriculating at an institution offering a broader
program enhanced a transfer student's
tendency to graduate in engineering, as suggested in
the analysis in Table II, disappeared when
individual differences in studentship were
considered. As suggested in an earlier paragraph, it
can be inferred that it was not the type of
program offered as such which caused transfer
students from two-year programs to tend to have
greater difficulty in graduating in engineering,
but rather that such institutions seemed to
enroll less talented students in this instance.

COLLEGE LEVEL STUDY SKILLS PROGRAMS:
SOME OBSERVATIONS 


WALTER S. BLAKE, Jr. 
University of Maryland 


COLLEGE-LEVEL study skills programs are
becoming more numerous. Twenty-four
institutions are planning such programs for the near
future. Institutions of higher learning are
enrolling anywhere from seven to 1400 students in
their programs in the United States and
Possessions, and all programs in which evaluations
have been undertaken report favorable results.
However, most of the programs seem to
resemble “Topsy” somewhat: they just “growed up”
without the benefit of the experiences of others,
because the experiences of others
in this field have not been reported in the
literature in any appreciable measure.

The University of Maryland began a program
in 1947, and it, too, grew largely out of
experimentation at Maryland rather than as a result
of the experiences of workers in other programs.
However, a study was undertaken in 1953 to
survey and evaluate both the program at the
University of Maryland and other programs in
operation throughout the United States and
Possessions in order to improve the program at the
University of Maryland in the light of the findings.*
The workers in the University of Maryland pro- 
gram feel that at least part of what they found 
out could benefit workers in other programs, 
and so the following highlights of the findings 
and recommendations from the study are pre- 
sented in the hope that the many program work- 
ers and their students will find the information 
useful to them. 


1. Most programs offer services to a limited 
segment of the school population. Forty-two 
and two-tenths percent admit voluntary and ref- 
erral students (probationers, etc.); 40% admit 
only voluntary students, and 11.1% require all
freshmen to enroll (with a few taking voluntary 
students as well). Six percent did not report in 
this area. The wide variation of admission pol- 
icies is surprising since the consensus is that 
any study skills program is composed of guid- 
ance services which should be available to the 
entire student body if the program is to attain 
its greatest effectiveness. All entering
freshmen should be assigned to a program designed
to indoctrinate them in the life on campus and in
the minimal skills needed to achieve their goals
at college and afterward; and the services of the
program (tutorial, remedial reading, study
skills and reading courses, counseling, etc.)
should be open to all students on campus who
feel a need for such services.

2. The ‘‘remedial’’ aura still surrounds and 
plagues study skills programs, in general. The 
remedial phase(s) of most programs take pre- 
cedence over the preventative phases, with the 
result that very few schools make provisions 
for helping any students other than those who 
must be helped. The ‘‘average’’ student is 
obliged to struggle along without assistance un- 
til he, or some faculty member, notices that he 
is about to fail out of college, at which time ‘‘re- 
medial’’ measures may be taken (if it is not al- 
ready too late). In most institutions where no 
required program for freshmen is offered, fac- 
ulty referrals and self-referrals are the only 
means available to help prevent academic fail- 
ure and social maladjustment. 

3. Program-planning with students is
conspicuously lacking in many of the programs
surveyed. Small staffs and insufficient operating
funds usually account for this; yet the absence 
of student-faculty planning is a serious short- 
coming, nonetheless, in programs of this kind. 
The types and extent of services offered should 
be the result of student-faculty planning, based 
upon research findings. One way to help insure 
student participation in the program is to incor- 
porate student-faculty planning as a part of the 
program itself. Written student evaluations, 
soliciting student suggestions, interviews with 
students, consultation with student government 
leaders, and regularly scheduled student-faculty 
meetings are useful methods. The main point 
here is this: faculty-seen needs are not neces- 
sarily student-seen needs—a well-known fact 
often overlooked. It is recognized that a well- 
trained faculty might know more about what stu- 
dents need than the students themselves, yet this 
obviously does not guarantee student acceptance 
of a program planned entirely by faculty mem- 
bers. Student-faculty planning might well be 
termed a ‘‘calculated risk’’ in the study skills 
area; but it seems no less essential than 
in any other situation where democratic proced- 
ures seem likely to produce the best results. 


*This article is based upon a doctoral study completed by the author entitled: A Survey and [remainder of title illegible in source]
4. Research is being done neither in the
minimal quantity necessary nor in the areas where
it is most needed. The quantity of research 
needed will necessarily be governed by needs of 
individual programs; but every program needs 
research of the kind which will indicate (1) wheth- 
er the program is achieving set goals, and (2) 
what needs to be done to improve the program. 
While it is true that program workers spend 
most of their time giving service (as do most 
people in the various branches of the teaching 
profession), it is equally true that a part of
every worker's time needs to be devoted to
research in the program if the program is to be
successful, and if the workers are to have con- 
fidence in the program itself as well as their 
part in the program. Research is needed par- 
ticularly in these areas: program evaluation, 
program improvement, and validation of diag- 
nostic instruments. 

5. Over half (51.1%) of the programs
surveyed do not give academic credit for
participation in the formalized parts (classes in study
skills and reading, mainly) of the programs.
Credit is ‘‘expected’’ by college students for 
work done under the auspices of the institution 
out of habit and tradition. Good or bad, it is 
nonetheless true that college credit is a motivat- 
ing factor with college students—perhaps the 
most important single motivating factor. It is 
also true that student initiative is important to 
any student's success or failure in meeting or 
solving his problems. Therefore, it seems im- 
portant to make the process of problem-solving 
in any group guidance situation as profitable as 
possible to students in order to nurture initia- 
tive. Some workers who do not give academic 
credit feel that some of the services rendered 
and some of the course materials and techniques 
used are not ‘‘college level’’ in terms of the con- 
ventional college-level courses. While such may 
actually be the case in many programs, the fail- 
ure to grant some credit for work accomplished 
may doom good programs to ineffectuality, no 
matter how fine such programs may be potenti- 
ally. 

6. Study skills programs need workers
trained to work in study skills programs. At
present, nearly all workers are educators,
psychologists, or other kinds of specialists not
necessarily trained to be workers in study skills
programs. Workers having majored in areas 
such as education and psychology might have 
some of the general qualifications needed (like 
the desire to work with students); but workers 
could have the special qualifications needed only 
by chance. For example, educators do not us- 
ually learn psychology in their curriculum, and 
psychologists do not learn teaching methods; 
yet both psychology and teaching methods are 
acknowledged to be two of the important special 
qualifications desirable for program workers by 
program workers themselves. Only one institu- 
tion, out of the many contacted in the survey, of- 
fers a training program specifically for study 
skills program workers, yet hundreds of
persons are now employed in such programs, and
24 institutions plan such programs for the future.

7. Study skills programs are not publicized 
adequately, as a rule; indeed, some are kept
on a “confidential” basis among staff members.
The reticence on the part of the program
workers to make their services known does a
disservice to the student body and also prevents the
programs from reaching their maximum level of
effectiveness. Evidence points to frugal financing
of such programs as the basic reason for
curtailed services as well as lack of publicity about
services offered; but it seems certain that a 
program designed to help students cannot be kept 
secret from students and at the same time serve 
their needs. The publicizing of programs need 
not be of the conventional advertising variety, of 
course; but the program should be made known 
to all students through written notices concern- 
ing services, hours, etc., articles in the cam- 
pus literature which will reach and be read by 
both students and faculty, and any other device 
available to workers. The students and faculty 
who have received satisfactory service provided 
by the program will, of course, be the best pub- 
licity mediums, once the program has been op- 
erating long enough to become known on the 
campus. 